(57) Abstract: A method is disclosed for encapsulating plasmids,
oligonucleotides or negatively-charged drugs into liposomes that have a
different lipid composition between their inner and outer membrane
bilayers and are able to reach primary tumors and their metastases after
intravenous injection into animals and humans. The formulation method
includes complex formation between DNA, cationic lipid molecules and
fusogenic/NLS peptide conjugates composed of a hydrophobic chain of about
10-20 amino acids and also containing four or more histidine residues or
an NLS at one end. The encapsulated molecules display therapeutic efficacy
in eradicating a variety of solid human tumors including, but not limited
to, breast carcinoma and prostate carcinoma. Combinations of the plasmids,
oligonucleotides or negatively-charged drugs with other antineoplastic
drugs (the positively-charged cisplatin, doxorubicin) encapsulated into
liposomes are of therapeutic value. Also of therapeutic value in cancer
eradication are combinations of the encapsulated plasmids,
oligonucleotides or negatively-charged drugs with HSV-tk plus encapsulated
ganciclovir.

ENCAPSULATION OF PLASMID DNA (LIPOGENES™) AND
THERAPEUTIC AGENTS WITH NUCLEAR LOCALIZATION
SIGNAL/FUSOGENIC PEPTIDE CONJUGATES INTO TARGETED
LIPOSOME COMPLEXES

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional
Application Serial No. 60/210,925, filed June 9, 2000. The contents of this
application are hereby incorporated by reference into the present disclosure.

FIELD OF THE INVENTION 

The present invention relates to the field of gene therapy and is specifically
directed toward methods for producing peptide-lipid-polynucleotide complexes
suitable for delivery of polynucleotides to a subject. The
peptide-lipid-polynucleotide complexes so produced are useful in a subject for
inhibiting the progression of neoplastic disease.

BACKGROUND OF THE INVENTION 

Throughout this application various publications, patents and published
patent specifications are referenced by author and date or by an identifying patent
number. Full bibliographical citations for the publications are provided immediately
preceding the claims. The disclosures of these publications, patents and published
patent specifications are hereby incorporated by reference into the present disclosure
to more fully describe the state of the art to which this invention pertains.

Gene therapy is a newly emerging field of biomedical research that holds
great promise for the treatment of both acute and chronic diseases and has the
potential to bring a revolutionary era to molecular medicine. However, despite
numerous preclinical and clinical studies, routine use of gene therapy for the
treatment of human disease has not yet been perfected. It remains an important
unmet need of gene therapy to create gene delivery systems that effectively target
specific cells of interest in a subject while controlling harmful side effects.






WO 01/93836 PCT/US01/18657 



Gene therapy is aimed at introducing therapeutically important genes into
somatic cells of patients. Diseases already shown to be amenable to therapy with
gene transfer in clinical trials include cancer (melanoma, breast, lymphoma, head
and neck, ovarian, colon, prostate, brain, chronic myelogenous leukemia, non-small
cell lung, lung adenocarcinoma, colorectal, neuroblastoma, glioma, glioblastoma,
astrocytoma, and others), AIDS, cystic fibrosis, adenosine deaminase (ADA) deficiency,
cardiovascular diseases (restenosis, familial hypercholesterolemia, peripheral artery
disease), Gaucher disease, α1-antitrypsin deficiency, rheumatoid arthritis and others.
Human diseases expected to be the object of clinical trials include hemophilia A and
B, Parkinson's disease, ocular diseases, xeroderma pigmentosum, high blood
pressure, and obesity. ADA deficiency was the disease successfully treated by the first
human "gene transfer" experiment conducted by Kenneth Culver in 1990. See
Culver, K.W. (1996) in: Gene Therapy: A Primer for Physicians, Second Ed., Mary
Ann Liebert, Inc. Publ., New York, pp. 1-198.

The primary goals of gene therapy are to repair or replace mutated genes,
regulate gene expression and signal transduction, manipulate the immune system, or
target malignant and other cells for destruction. See Anderson, W.F. (1992) Science
256:808-813; Lasic, D. (1997) in: Liposomes in Gene Delivery, CRC Press, pp. 1-295;
Boulikas, T. (1998) Gene Ther. Mol. Biol. 1:1-172; Martin, F. and Boulikas, T.
(1998) Gene Ther. Mol. Biol. 1:173-214; Ross, G. et al. (1996) Hum. Gene Ther.
7:1781-1790.

Human cancer presents a particular disease condition for which effective
gene therapy methods would provide a particularly useful clinical benefit. Gene
therapy concepts for treatment of such diseases include stimulation of immune
responses as well as manipulation of a variety of alternative cellular functions that
affect the malignant phenotype. Although many human tumors are non- or weakly
immunogenic, the immune system can be reinforced and instructed to eliminate
cancer cells after transduction of a patient's cells ex vivo with the cytokine genes
GM-CSF, IL-12, IL-2, IL-4, IL-7, IFN-γ, and TNF-α, followed by cell vaccination of
the patient (e.g., intradermally) to potentiate T-lymphocyte-mediated antitumor
effects (cancer immunotherapy). DNA vaccination with genes encoding tumor
antigens and immunotherapy with synthetic tumor peptide vaccines are further
developments that are currently being tested. The genes used for cancer gene
therapy in human clinical trials include a number of tumor suppressor genes (p53,
RB, BRCA1, E1A), antisense oncogenes (antisense c-fos, c-myc, K-ras), and suicide
genes (HSV-tk in combination with ganciclovir, cytosine deaminase in combination
with 5-fluorocytosine). Other important genes that have been proposed for cancer
gene therapy include bcl-2, MDR-1, p21, p16, bax, bcl-xs, E2F, IGF-I, VEGF,
angiostatin, CFTR, LDL-R, TGF-β, and leptin. One major hurdle preventing
successful implementation of these gene therapies is the difficulty of efficiently
delivering an effective dose of polynucleotides to the site of the tumor. Thus, gene
delivery systems with enhanced transfection capabilities would be highly
advantageous.

A number of different vector technologies and gene delivery methods have
been proposed and tested for delivering genes in vivo, including viral vectors and
various nucleic acid encapsulation techniques. Alternative viral delivery vehicles for
genes include murine retroviruses, recombinant adenoviral vectors, adeno-associated
virus, HSV, EBV, HIV vectors, and baculovirus. Nonviral gene delivery methods
use cationic or neutral liposomes, direct injection of plasmid DNA, and polymers.
Various strategies to enhance efficiency of gene transfer have been tested, such as
fusogenic peptides in combination with liposomes or polymers to enhance the
release of plasmid DNA from endosomes.

Each of the various gene delivery techniques has been found to possess
different strengths and weaknesses. Recombinant retroviruses stably integrate into
the chromosome but require host DNA synthesis to insert. Adenoviruses can infect
non-dividing cells but cause immune reactions leading to the elimination of
therapeutically transduced cells. Adeno-associated virus (AAV) is not pathogenic
and does not elicit immune responses, but new production strategies are required to
obtain high AAV titers for preclinical and clinical studies. Wild-type AAVs
integrate into chromosome 19, whereas recombinant AAVs are deprived of
site-specific integration and may also persist episomally.

Herpes Simplex Virus (HSV) vectors can infect non-replicating cells, such as
neuronal cells, and have a high payload capacity for foreign DNA but inflict cytotoxic
effects. It seems that each delivery system will be developed independently of the
others and that each will demonstrate strengths and weaknesses for certain
applications. At present, retroviruses are most commonly used in human clinical
trials, followed by adenoviruses, cationic liposomes and AAV.

As the challenges of perfecting gene therapy techniques have become
apparent, a variety of additional delivery systems have been proposed to circumvent
the difficulties observed with standard technologies. For example, cell-based gene
delivery using polymer-encapsulated syngeneic or allogeneic cells implanted into a
tissue of a patient can be used to secrete therapeutic proteins. This method is being
tested in trials for amyotrophic lateral sclerosis using the ciliary neurotrophic factor
gene, and may be extended to Factor VIII and IX for hemophilia, interleukin genes,
dopamine-secreting cells to treat Parkinson's disease, nerve growth factor for
Alzheimer's disease and other diseases. Other techniques under development
include vectors with the Cre-LoxP recombinase system to rid transfected cells of
undesirable viral DNA sequences, use of tissue-specific promoters to express a gene
in a particular cell type, or use of ligands recognizing cell surface molecules to direct
gene vehicles to a particular cell type.

Additional methods that have been proposed for improving the efficacy of
gene therapy technologies include designing p53 "gene bombs" that explode into
tumor cells, exploiting the HIV-1 virus to engineer vectors for gene transfer,
combining viruses with polymers or cationic lipids to improve gene transfer, the
attachment of nuclear localization signal peptides to oligonucleotides to direct genes
to nuclei, and the development of molecular switch systems allowing genes to be
turned on or off at will. Nevertheless, because of the wide range of disease
conditions for which gene therapies are required, and the complexities of developing
treatments for such diseases, there remains a need for improved techniques for
performing gene therapy. The present invention provides methods and compositions
for addressing these issues.



DISCLOSURE OF THE INVENTION 

A method is disclosed for encapsulating DNA and negatively charged drugs
into liposomes having a different lipid composition between their inner and outer
membrane bilayers. The liposomes are able to reach primary tumors and their
metastases after intravenous injection into animals and humans. The method includes
micelle formation between DNA and a mixture of cationic lipid and peptide
molecules, at molar ratios close to charge neutralization, in 10-90% ethanol; the
cationic peptides specify nuclear localization and have a hydrophobic moiety
endowed with membrane fusion activity to improve entry of the complex across the
cell membrane. These peptides insert with their cationic portion directed toward
condensed DNA and their hydrophobic chain buried together with the hydrophobic
chains of the lipids in the micelle membrane monolayer. The DNA/lipid/peptide
micelles are converted into liposomes by mixing with pre-made liposomes or lipids,
followed by dilution in aqueous solutions and dialysis to remove the ethanol and
allow liposome formation, and by extrusion through membranes to a diameter below
160 nm, entrapping and encapsulating DNA with a very high yield. The encapsulated
DNA has a high therapeutic efficacy in eradicating a variety of solid human tumors
including, but not limited to, breast carcinoma and prostate carcinoma. A plasmid is
constructed with DNA carrying anticancer genes including, but not limited to, p53,
RB, BRCA1, E1A, bcl-2, MDR-1, p21, p16, bax, bcl-xs, E2F, IGF-I, VEGF,
angiostatin, oncostatin, endostatin, GM-CSF, IL-12, IL-2, IL-4, IL-7, IFN-γ, TNF-α,
HSV-tk (in combination with ganciclovir), E. coli cytosine deaminase (in
combination with 5-fluorocytosine) and is combined with encapsulated cisplatin or
with other similarly systemically delivered antineoplastic drugs to suppress cancer.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 illustrates the structure of the cancer targeted liposome complex. 

FIG. 2 illustrates the effect of plasmid DNA condensation with various
agents, and of various formulations of cationic liposomes, on the level of
expression of the reporter beta-galactosidase gene after transfection of K562 human
erythroleukemia cell cultures.

FIG. 3 illustrates tumor targeting in SCID mice. FIG. 3A shows a SCID mouse
with a large and a small human breast tumor before and after staining with X-Gal to
test the expression of the transferred gene. Both tumors turn dark blue. The
intensity of the blue color is proportional to the expression of the beta-galactosidase
gene.

FIG. 3B shows that in the initial staining of the small tumor, the skin and the
intestines at the injection area are the first organs to turn blue. FIG. 3C is a view of
the back of the animal. The two tumors are clearly visible after removal of the skin
(top). Dark staining of the small tumor and light blue staining of the large tumor are
evident at an initial stage of staining (bottom). FIG. 3D is a view of the front side of
the animal. The two tumors are clearly visible after removal of the skin. In the
bottom panel, the dark staining of both tumors is evident at a later stage of
staining.

FIG. 3E shows the front (top) and rear (bottom) higher magnification views of
the dark staining of both tumors at a later stage of staining. Staining of the
vascular system around the small tumor can also be seen (bottom).

BRIEF DESCRIPTION OF THE TABLES 

Table 1 is a list of molecules able to form micelles.

Table 2 lists several fusogenic peptides and describes their properties, along
with a reference.

Table 3 lists simple Nuclear Localization Signal (NLS) peptides.

Table 4 shows a list of "bipartite" or "split" NLS peptides.

Table 5 lists "nonpositive NLS" peptides lacking clusters of
arginines/lysines.

Table 6 lists peptides with nucleolar localization signals (NoLS).

Table 7 lists peptides having karyophilic clusters on non-membrane protein
kinases.

Table 8 lists peptide nuclear localization signals on DNA repair proteins.

Table 9 lists NLS peptides in transcription factors.

Table 10 lists NLS peptides in other nuclear proteins.

MODES FOR CARRYING OUT THE INVENTION 
Definitions 

The practice of the present invention will employ, unless otherwise indicated,
conventional techniques of immunology, molecular biology, microbiology, cell
biology and recombinant DNA. These methods are described in the following
publications. See, e.g., Sambrook, et al. MOLECULAR CLONING: A LABORATORY
MANUAL, 2nd Edition (1989); CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, F.M.
Ausubel, et al. eds., (1987); the series METHODS IN ENZYMOLOGY (Academic Press,
Inc.); PCR: A PRACTICAL APPROACH, M. MacPherson, et al., IRL Press at Oxford
University Press (1991); PCR 2: A PRACTICAL APPROACH, MacPherson et al., eds.
(1995); ANTIBODIES, A LABORATORY MANUAL, Harlow and Lane, eds. (1988); and
ANIMAL CELL CULTURE, R.I. Freshney, ed. (1987).

As used in the specification and claims, the singular forms "a," "an" and "the"
include plural references unless the context clearly dictates otherwise. For example,
the term "a cell" includes a plurality of cells, including mixtures thereof.

The term "comprising" is intended to mean that the compositions and
methods include the recited elements, but do not exclude others. "Consisting
essentially of," when used to define compositions and methods, shall mean excluding
other elements of any essential significance to the combination. Thus, a composition
consisting essentially of the elements as defined herein would not exclude trace
contaminants from the isolation and purification method and pharmaceutically
acceptable carriers, such as phosphate buffered saline, preservatives, and the like.
"Consisting of" shall mean excluding more than trace elements of other ingredients
and substantial method steps for administering the compositions of this invention.
Embodiments defined by each of these transition terms are within the scope of this
invention.

The terms "polynucleotide" and "nucleic acid molecule" are used
interchangeably to refer to polymeric forms of nucleotides of any length. The
polynucleotides may contain deoxyribonucleotides, ribonucleotides, and/or their
analogs. Nucleotides may have any three-dimensional structure, and may perform
any function, known or unknown. The term "polynucleotide" includes, for example,
single-stranded, double-stranded and triple helical molecules, a gene or gene fragment,
exons, introns, mRNA, tRNA, rRNA, ribozymes, cDNA, recombinant
polynucleotides, branched polynucleotides, plasmids, vectors, isolated DNA of any
sequence, isolated RNA of any sequence, nucleic acid probes, and primers. A
nucleic acid molecule may also comprise modified nucleic acid molecules.

A "gene" refers to a polynucleotide containing at least one open reading
frame that is capable of encoding a particular polypeptide or protein after being
transcribed and translated.

A "gene product" refers to the amino acid (e.g., peptide or polypeptide)
generated when a gene is transcribed and translated.

The following abbreviations are used herein: DDAB: dimethyldioctadecyl
ammonium bromide (same as N,N-distearyl-N,N-dimethylammonium bromide);
DODAC: N,N-dioleyl-N,N-dimethylammonium chloride; DODAP: 1,2-dioleoyl-3-
dimethylammonium propane; DMRIE: N-[1-(2,3-dimyristyloxy)propyl]-N,N-
dimethyl-N-(2-hydroxyethyl) ammonium bromide; DMTAP: 1,2-dimyristoyl-3-
trimethylammonium propane; DOGS: dioctadecylamidoglycylspermine; DOTAP
(same as DOTMA): N-(1-(2,3-dioleoyloxy)propyl)-N,N,N-trimethylammonium
chloride; DOSPA: N-(1-(2,3-dioleyloxy)propyl)-N-(2-
(sperminecarboxamido)ethyl)-N,N-dimethyl ammonium trifluoroacetate; DPTAP:
1,2-dipalmitoyl-3-trimethylammonium propane; DSTAP: 1,2-distearoyl-3-
trimethylammonium propane; DOPE: 1,2-sn-dioleoylphosphatidylethanolamine;
DC-Chol: 3β-(N-(N',N'-dimethylaminoethane)carbamoyl)cholesterol. See Gao et
al., Biochem. Biophys. Res. Comm. 179:280-285 (1991).

As used herein, the term "pharmaceutically acceptable anion" refers to
anions of organic and inorganic acids that provide non-toxic salts in pharmaceutical
preparations. Examples of such anions include the halide anions chloride,
bromide, and iodide; inorganic anions such as sulfate, phosphate, and nitrate; and
organic anions. Organic anions may be derived from simple organic acids, such as
acetic acid, propionic acid, glycolic acid, pyruvic acid, oxalic acid, malic acid,
malonic acid, succinic acid, maleic acid, fumaric acid, tartaric acid, citric acid,
benzoic acid, cinnamic acid, mandelic acid, methane sulfonic acid, ethane sulfonic
acid, p-toluenesulfonic acid, and the like. The preparation of pharmaceutically
acceptable salts is described in Berge, et al., J. Pharm. Sci. 66:1-19 (1977),
incorporated herein by reference.

Physiologically acceptable carriers, excipients or stabilizers are nontoxic to
recipients at the dosages and concentrations employed, and include buffers such as
phosphate, citrate, and other organic acids; antioxidants including ascorbic acid; low
molecular weight (less than about 10 residues) polypeptides; proteins, such as serum
albumin, gelatin, or immunoglobulins; hydrophilic polymers such as
polyvinylpyrrolidone; amino acids such as glycine, glutamine, asparagine, arginine
or lysine; monosaccharides, disaccharides, and other carbohydrates including
glucose, mannose, or dextrins; chelating agents such as EDTA; sugar alcohols such
as mannitol or sorbitol; salt-forming counter ions such as sodium; and/or nonionic
surfactants such as Tween, Pluronics or polyethylene glycol (PEG). PEG molecules
also contain a fusogenic peptide with an attached Nuclear Localization Signal (NLS)
covalently linked to the end of the PEG molecule.

The term "cationic lipid" refers to any of a number of lipid species that carry
a net positive charge at physiological pH. Such lipids include, but are not limited to,
DDAB, DMRIE, DODAC, DOGS, DOTAP, DOSPA and DC-Chol. Additionally, a
number of commercial preparations of cationic lipids are available that can be used
in the present invention. These include, for example, LIPOFECTIN (commercially
available cationic liposomes comprising DOTMA and DOPE, from GIBCO/BRL,
Grand Island, N.Y., USA); LIPOFECTAMINE (commercially available cationic
liposomes comprising DOSPA and DOPE, from GIBCO/BRL); and
TRANSFECTAM (commercially available cationic lipids comprising DOGS in
ethanol, from Promega Corp., Madison, Wis., USA).

This invention further provides a number of methods for producing micelles
with entrapped therapeutic drugs. The methods are particularly useful to produce
micelles of drugs or compositions having a net overall negative charge, e.g., DNA,
RNA or negatively charged small molecules. For example, the DNA can be
comprised within a plasmid vector and encode a therapeutic protein, e.g., wild-type
p53, HSV-tk, p21, Bax, Bad, IL-2, IL-12, GM-CSF, angiostatin, endostatin and
oncostatin. In one embodiment, the method requires combining an effective amount
of the therapeutic agent with an effective amount of cationic lipids. Cationic lipids
useful in the methods of this invention include, but are not limited to, DDAB:
dimethyldioctadecyl ammonium bromide; DMRIE: N-[1-(2,3-
dimyristyloxy)propyl]-N,N-dimethyl-N-(2-hydroxyethyl) ammonium bromide;
DMTAP: 1,2-dimyristoyl-3-trimethylammonium propane; DOGS:
dioctadecylamidoglycylspermine; DOTAP (same as DOTMA): N-(1-(2,3-
dioleoyloxy)propyl)-N,N,N-trimethylammonium chloride; DPTAP: 1,2-
dipalmitoyl-3-trimethylammonium propane; DSTAP: 1,2-distearoyl-3-
trimethylammonium propane.

In one aspect, from about 30% to about 90% of the phosphates contained
within the negatively charged therapeutic agent are neutralized by positive charges
on lipid molecules (negative charges are in excess) to form an electrostatic micelle
complex in an effective concentration of ethanol. In one aspect, the ethanol solution
is from about 20% to about 80% ethanol. In a further aspect, the ethanol
concentration is about 30%. The ethanol/cationic lipid/therapeutic agent complex is
then combined with an effective amount of a fusogenic-karyophilic peptide
conjugate. In one aspect, an effective amount of the conjugate is a ratio ranging from
about 0.0 to about 0.3 (positive charges on peptide to negative charges on phosphate
groups), which neutralizes the majority of the remaining negative charges on the
phosphate groups of the therapeutic agent, thereby leading to an almost complete
neutralization of the complex. The optimal conditions give the complex a slightly
negative charge. However, when the positive charges on cationic lipids exceed the
negative charges on the DNA, the excess positive charges are neutralized by
DPPG (dipalmitoyl phosphatidyl glycerol) and its derivatives, or by other anionic
lipid molecules in the final micelle complex.
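
The charge bookkeeping in the preceding paragraph can be illustrated with a short Python sketch. It is a minimal sketch only, not the inventors' procedure: the average nucleotide mass of about 330 g/mol, the single negative charge per nucleotide, the default charges per lipid and per peptide, and all function and parameter names are assumptions introduced here for illustration.

```python
# Illustrative charge bookkeeping for a DNA / cationic lipid / peptide micelle.
# Assumptions (not taken from the specification): an average nucleotide mass
# of ~330 g/mol and one negative (phosphate) charge per nucleotide.
AVG_NUCLEOTIDE_MASS = 330.0  # g/mol, approximate


def phosphate_nmol(dna_ug):
    """Approximate nanomoles of phosphate charges in dna_ug micrograms of DNA."""
    return dna_ug * 1e-6 / AVG_NUCLEOTIDE_MASS * 1e9


def formulation_charges(dna_ug, lipid_fraction, peptide_ratio,
                        charges_per_lipid=1, charges_per_peptide=5):
    """Estimate lipid and peptide amounts for a target neutralization level.

    lipid_fraction: fraction of phosphates neutralized by lipid (text: ~0.30-0.90).
    peptide_ratio: peptide positive charges per phosphate (text: ~0.0-0.3).
    charges_per_lipid / charges_per_peptide are hypothetical defaults.
    """
    if not 0.30 <= lipid_fraction <= 0.90:
        raise ValueError("lipid_fraction outside the range given in the text")
    if not 0.0 <= peptide_ratio <= 0.3:
        raise ValueError("peptide_ratio outside the range given in the text")
    phosphates = phosphate_nmol(dna_ug)
    return {
        "phosphate_nmol": phosphates,
        "lipid_nmol": phosphates * lipid_fraction / charges_per_lipid,
        "peptide_nmol": phosphates * peptide_ratio / charges_per_peptide,
        # Negative value: phosphate charges remain in excess (slightly negative complex).
        "net_charge_nmol": phosphates * (lipid_fraction + peptide_ratio - 1.0),
    }


if __name__ == "__main__":
    for key, value in formulation_charges(100.0, 0.7, 0.2).items():
        print(f"{key}: {value:.1f}")
```

With a lipid fraction of 0.7 and a peptide ratio of 0.2, about 10% of the phosphate charges remain unneutralized, which corresponds to the slightly negative complex described as optimal above.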

In an alternative embodiment, the above methods can be modified by
addition of DNA condensing agents selected from spermine, spermidine, and
magnesium or other divalent metal ions neutralizing a certain percentage (1-20%) of
phosphate groups.

In a further embodiment, the cationic lipids are combined with an effective
amount of the fusogenic lipid DOPE at various molar ratios, for example, in a molar
ratio of about 1:1 cationic lipid:DOPE. In an alternative embodiment, the
cationic lipids are combined with an effective amount of a fusogenic/NLS peptide
conjugate. Examples of fusogenic/NLS peptide conjugates include, but are not
limited to, (KAWLKAF)3 (SEQ ID NO:1), GLFKAAAKLLKSLWKLLLKA (SEQ
ID NO:2), LLLKAFAKLLKSLWKLLLKA (SEQ ID NO:3), as well as all
derivatives of the prototype (Hydrophobic3-Karyophilic1-Hydrophobic2-
Karyophilic1)2-3, where Hydrophobic is any of A, I, L, V, P, G, W, F and
Karyophilic is any of K, R, or H, containing a positively-charged residue every
3rd or 4th amino acid, which form alpha helices and direct a net positive charge to
the same side of the helix. Additional examples include, but are not limited to,
GLFKAIAGFIKNGWKGMIDGGGYC (SEQ ID NO:4) from influenza virus
hemagglutinin HA-2; YGRKKRRQRRR (SEQ ID NO:5) from TAT of HIV;
MSGTFGGILAGLIGLL(K/R/H)1-6 (SEQ ID NO:6), derived from the N-terminal
region of the S protein of duck hepatitis B virus, but with the addition of one to six
positively-charged lysine, arginine or histidine residues, and combinations of these,
able to interact directly with the phosphate groups of plasmid or oligonucleotide
DNA, compensating for part of the positive charges provided by the cationic lipids;
GAAIGLAWIPYFGPAA (SEQ ID NO:7), derived from the fusogenic peptide of
the Ebola virus transmembrane protein; residues 53-70 (C-terminal helix) of
apolipoprotein (apo) AII peptide; the 23-residue fusogenic N-terminal peptide of
HIV-1 transmembrane glycoprotein gp41; the 29-42 residue fragment from
Alzheimer's β-amyloid peptide; the fusion peptide and N-terminal heptad repeat of
Sendai virus; and the 56-68 helical segment of lecithin cholesterol acyltransferase.
Included within these embodiments are shorter versions of these peptides that are
known to induce fusion of unilamellar lipid vesicles, or all that are similarly
derivatized with the addition of one to six positively-charged lysine, arginine or
histidine residues (K/R/H)1-6 able to interact directly with the phosphate groups of
plasmid or oligonucleotide DNA, compensating for part of the positive charges
provided by the cationic lipids. The fusogenic peptides in the fusogenic/NLS
conjugates represent hydrophobic amino acid stretches, and smaller fragments of
these peptide sequences, that include all signal peptide sequences used in membrane
or secreted proteins that insert into the endoplasmic reticulum. Alternatively, the
conjugates represent transmembrane domains and smaller fragments of these
peptide sequences.
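
The spacing rule stated for the prototype conjugates (a positively-charged residue every 3rd or 4th amino acid) can be checked for a candidate sequence with a few lines of Python. This is an illustrative sketch under stated assumptions: the helper names are hypothetical, and treating "every 3rd or 4th" as "every gap between consecutive K/R/H residues is 3 or 4" is one possible reading of the text, not a definition taken from it.

```python
# Check the spacing of karyophilic (K, R, H) residues in a peptide.
# Helper names are hypothetical; the gap-of-3-or-4 rule is an interpretation.
CATIONIC = set("KRH")


def cationic_positions(peptide):
    """1-based positions of K, R or H residues in the peptide."""
    return [i for i, aa in enumerate(peptide.upper(), start=1) if aa in CATIONIC]


def cationic_spacings(peptide):
    """Distances between consecutive cationic residues."""
    positions = cationic_positions(peptide)
    return [b - a for a, b in zip(positions, positions[1:])]


def follows_spacing_rule(peptide):
    """True if the peptide has at least two cationic residues and every gap is 3 or 4."""
    gaps = cationic_spacings(peptide)
    return bool(gaps) and all(gap in (3, 4) for gap in gaps)


if __name__ == "__main__":
    for name, seq in [("SEQ ID NO:1", "KAWLKAF" * 3),
                      ("SEQ ID NO:2", "GLFKAAAKLLKSLWKLLLKA"),
                      ("SEQ ID NO:5", "YGRKKRRQRRR")]:
        print(name, cationic_spacings(seq), follows_spacing_rule(seq))
```

Under this reading, SEQ ID NOs:1-3 satisfy the rule, whereas the TAT peptide (SEQ ID NO:5), whose cationic residues are clustered, does not.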

In one aspect of the invention, the NLS peptide component in
fusogenic/NLS peptide conjugates is derived from the fusogenic hydrophobic
peptides. However, there is an addition of 5-6 amino acid karyophilic Nuclear
Localization Signals (NLS) derived from a number of known NLS peptides, as well
as from searches of the nuclear protein databases for karyophilic stretches of five or
more amino acids in proteins, containing at least four positively-charged amino
acids and flanked by a proline (P) or glycine (G). Examples of NLS
peptides are shown in Tables 1-8. The NLS peptide components in fusogenic/NLS
peptide conjugates are synthetic peptides containing the above said NLS, but further
modified by additional K, R, H residues at the central part of the peptide or with P
or G at the N- or C-terminus.
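
A database search of the kind described above could be sketched as a sliding-window scan over a protein sequence. The following Python sketch is an assumption-laden illustration, not the search actually used: the function name is hypothetical, "flanked by" is read as a proline or glycine immediately before or after the window (on at least one side), and the example is a made-up toy sequence that embeds the well-known SV40 large T antigen motif PKKKRKV.

```python
# Sliding-window scan for karyophilic stretches flanked by P or G.
# Function name and the one-sided reading of "flanked" are assumptions.
CATIONIC = set("KRH")
FLANKS = set("PG")


def find_karyophilic_stretches(protein, window=5, min_cationic=4):
    """Return (0-based start, segment) for each qualifying window."""
    protein = protein.upper()
    hits = []
    for start in range(len(protein) - window + 1):
        segment = protein[start:start + window]
        if sum(aa in CATIONIC for aa in segment) < min_cationic:
            continue
        # Residues immediately before and after the window ("" at the termini).
        left = protein[start - 1] if start > 0 else ""
        end = start + window
        right = protein[end] if end < len(protein) else ""
        if left in FLANKS or right in FLANKS:
            hits.append((start, segment))
    return hits


if __name__ == "__main__":
    toy_sequence = "MAPKKKRKVED"  # toy sequence, not a real protein entry
    print(find_karyophilic_stretches(toy_sequence))
```

Requiring the flanking residue on both sides, or widening the window to six residues, only changes the `left`/`right` test or the `window` argument.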

In a further aspect, the fusogenic/NLS peptide conjugates are derived from
the said fusogenic hydrophobic peptides but with the addition of a stretch of H4-6
(four to six histidine residues) in the place of NLS. Micelle formation takes place at
pH 5-6, where histidyl residues are positively charged; they lose their charge at the
nearly neutral pH of the biological fluids, thus releasing the plasmid or
oligonucleotide DNA from their electrostatic interaction.
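
The pH dependence of the histidine charge described above follows from the Henderson-Hasselbalch relation. The Python sketch below is illustrative only; the side-chain pKa of 6.0 is an assumed typical value (the actual pKa depends on the local environment), and the function names are hypothetical.

```python
# Fraction of histidine side chains carrying a positive charge at a given pH,
# from the Henderson-Hasselbalch relation. The pKa of 6.0 is an assumption.
HIS_SIDE_CHAIN_PKA = 6.0


def protonated_fraction(ph, pka=HIS_SIDE_CHAIN_PKA):
    """Fraction of imidazole groups that are protonated (positively charged)."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def expected_his_charge(n_his, ph, pka=HIS_SIDE_CHAIN_PKA):
    """Average positive charge carried by a stretch of n_his histidines."""
    return n_his * protonated_fraction(ph, pka)


if __name__ == "__main__":
    for ph in (5.0, 5.5, 6.0, 7.4):
        print(f"pH {ph}: {expected_his_charge(6, ph):.2f} charges on H6")
```

Under these assumptions a six-histidine stretch carries most of its possible charge at pH 5 and only a small fraction of a charge at pH 7.4, consistent with binding during micelle formation and release in biological fluids.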

The fusogenic peptide/NLS peptide conjugates are linked to each other with
a short amino acid stretch representing an endogenous protease cleavage site.

In a preferred aspect of the invention, the structure of the preferred prototype
fusogenic/NLS peptide conjugate used in this invention is: PKKRRGPSP(L/A/I)12-20
(SEQ ID NO:8), where (L/A/I)12-20 is a stretch of 12-20 hydrophobic amino acids
containing A, L, I, Y, W, F and other hydrophobic amino acids.

This invention further provides for conversion of the micelles made by the above
methods into liposomes. An effective amount of liposomes
(diameter from about 80 to about 160 nm), or of a lipid solution composed of
cholesterol (from about 10% to about 50%), neutral phospholipid such as
hydrogenated soy phosphatidylcholine (HSPC) (from about 40% to about 90%), and
the derivatized vesicle-forming lipid PEG-DSPE (distearoylphosphatidyl
ethanolamine) (from about 1 to about 7 mole percent), is added to the micelle
solution.

In a specific embodiment, the liposomes are composed of vesicle-forming
lipids and from about 1 to about 7 mole percent of distearoylphosphatidyl
ethanolamine (DSPE) derivatized with a polyethyleneglycol. The
polyethyleneglycol has a molecular weight between about
1,000 and about 5,000 daltons. Micelles are converted into liposomes with a concomitant
decrease of the ethanol concentration, which can be accomplished by removal of the
ethanol by dialysis of the liposome complexes through permeable membranes; the
liposomes can be reduced to a diameter of 80-160 nm by extrusion through membranes.

Liposome-encapsulated therapeutic agents produced by the above methods
are further provided by this invention.

Also provided herein is a method for delivering a therapeutic agent such as
plasmid DNA or oligonucleotides to a tissue cell in vivo by intravenous or other
type of injection of the micelles or liposomes. This method specifically targets a
primary tumor and the metastases through the long circulating time of the micelle or
liposome complex (owing to the exposure of PEG chains on its surface), its small
size (80-160 nm), and the decrease in hydrostatic pressure in the solid tumor from
the center to its periphery, which supports preferential extravasation through the tumor
vasculature to the extracellular space in tumors. A method for delivering plasmid or
oligonucleotide DNA across the cell membrane barrier of the tumors using the
micelle or liposome complexes described herein is made possible by the presence
of the fusogenic peptides in the complex. In particular, a method for delivering
plasmid or oligonucleotide DNA to the liver, spleen and bone marrow after
intravenous injection of the complexes is provided. Further provided is a method
for delivering therapeutic genes to the liver, spleen and bone marrow of cancer and
noncancer patients including, but not limited to, factor VIII or IX for the therapy of
hemophilias, multidrug resistance, cytokine genes for cancer immunotherapy, genes
for the alleviation of pain, genes for the alleviation of diabetes and genes that can be
introduced to liver, spleen and bone marrow tissue to produce a secreted form of a
therapeutic protein.

The disclosed therapies also provide methods for reducing tumor size by
combining the encapsulated plasmid DNA carrying one or more anticancer genes
selected from the group consisting of p53, RB, BRCA1, E1A, bcl-2, MDR-1, p21,
p16, bax, bcl-xs, E2F, IGF-I, VEGF, angiostatin, oncostatin, endostatin, GM-CSF,
IL-12, IL-2, IL-4, IL-7, IFN-γ, TNF-α, HSV-tk (in combination with ganciclovir),
E. coli cytosine deaminase (in combination with 5-fluorocytosine) with
encapsulated antisense oligonucleotides (antisense c-fos, c-myc, K-ras), ribozymes
or triplex-forming oligonucleotides directed against genes that control the cell cycle 
or signaling pathways. These methods can be modified by combining the 
encapsulated plasmid DNA carrying one or more anticancer genes with
encapsulated or free antineoplastic drugs selected from the group consisting of adriamycin,
angiostatin, azathioprine, bleomycin, busulfane, camptothecin, carboplatin,
carmustine, chlorambucile, chlormethamine, chloroquinoxaline sulfonamide,
cisplatin, cyclophosphamide, cycloplatam, cytarabine, dacarbazine, dactinomycin,
daunorubicin, didox, doxorubicin, endostatin, enloplatin, estramustine, etoposide,
extramustinephosphat, flucytosine, fluorodeoxyuridine, fluorouracil, gallium nitrate, 
hydroxyurea, idoxuridine, interferons, interleukins, leuprolide, lobaplatin, 
lomustine, mannomustine, mechlorethamine, mechlorethaminoxide, melphalan,
mercaptopurine, methotrexate, mithramycin, mitobronitole, mitomycin,
mycophenolic acid, nocodazole, oncostatin, oxaliplatin, paclitaxel, pentamustine,
platinum-triamine complex, plicamycin, prednisolone, prednisone, procarbazine, 
protein kinase C inhibitors, puromycine, semustine, signal transduction inhibitors, 
spiroplatin, streptozotocine, stromelysin inhibitors, taxol, tegafur, telomerase
inhibitors, teniposide, thalidomide, thiamiprine, thioguanine, thiotepa, tiamiprine,
tretamine, triaziquone, trifosfamide, tyrosine kinase inhibitors, uramustine,
vidarabine, vinblastine, vinca alkaloids, vincristine, vindesine, vorozole, zeniplatin,
and zinostatin.

The following examples are intended to illustrate, but not limit the invention. 

Liposome Composition 

Liposomes are microscopic vesicles consisting of concentric lipid bilayers. 
Structurally, liposomes range in size and shape from long tubes to spheres, with 
dimensions from a few hundred Angstroms to fractions of a millimeter. Vesicle-forming
lipids are selected to achieve a specified degree of fluidity or rigidity of the
final complex providing the lipid composition of the outer layer. These are neutral 
(cholesterol) or bipolar and include phospholipids, such as phosphatidylcholine (PC), 
phosphatidylethanolamine (PE), phosphatidylinositol (PI), and sphingomyelin (SM) 
and other types of bipolar lipids including but not limited to
dioleoylphosphatidylethanolamine (DOPE), with a hydrocarbon chain length in the
range of 14-22, and saturated or with one or more double C=C bonds. Examples of 
lipids capable of producing a stable liposome, alone, or in combination with other 
lipid components are phospholipids, such as hydrogenated soy phosphatidylcholine
(HSPC), lecithin, phosphatidylethanolamine, lysolecithin, 
lysophosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, 
sphingomyelin, cephalin, cardiolipin, phosphatidic acid, cerebrosides, 
distearoylphosphatidylethanolamine (DSPE), dioleoylphosphatidylcholine (DOPC),
dipalmitoylphosphatidylcholine (DPPC), palmitoyloleoylphosphatidylcholine 
(POPC), palmitoyloleoylphosphatidylethanolamine (POPE) and 
dioleoylphosphatidylethanolamine 4-(N-maleimido-methyl)cyclohexane-1-carboxylate
(DOPE-mal). Additional non-phosphorous containing lipids that can
become incorporated into liposomes include stearylamine, dodecylamine,
hexadecylamine, isopropyl myristate, triethanolamine-lauryl sulfate, alkyl-aryl
sulfate, acetyl palmitate, glycerol ricinoleate, hexadecyl stearate, amphoteric acrylic
polymers, polyethyloxylated fatty acid amides, and the cationic lipids mentioned
above (DDAB, DODAC, DMRIE, DMTAP, DOGS, DOTAP (DOTMA), DOSPA,
DPTAP, DSTAP, DC-Chol). Negatively charged lipids that are able to form vesicles
include phosphatidic acid (PA), dipalmitoylphosphatidylglycerol (DPPG),
dioleoylphosphatidylglycerol (DOPG), and dicetylphosphate. Preferred lipids for use in
the present invention are cholesterol, hydrogenated soy phosphatidylcholine (HSPC)
and the derivatized vesicle-forming lipid PEG-DSPE.

Typically, liposomes can be divided into three categories based on their
overall size and the nature of the lamellar structure. The three classifications, as
developed by the New York Academy of Sciences Meeting, "Liposomes and Their Use
in Biology and Medicine," December 1977, are multi-lamellar vesicles (MLVs), 
small uni-lamellar vesicles (SUVs) and large uni-lamellar vesicles (LUVs). 

SUVs range in diameter from approximately 20 to 50 nm and consist of a
single lipid bilayer surrounding an aqueous compartment. Unilamellar vesicles can 
also be prepared in sizes from about 50 nm to 600 nm in diameter. While 
unilamellar vesicles are single-compartment vesicles of fairly uniform size, MLVs vary
greatly in size up to 10,000 nm, or thereabouts, are multi-compartmental in their
structure and contain more than one bilayer. LUV liposomes are so named because
of their large diameter that ranges from about 600 nm to 30,000 nm; they can contain 
more than one bilayer. 
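
The size ranges quoted above overlap, so a diameter alone does not always identify a single class. The following is a minimal illustrative sketch, not part of the disclosed method, that lists every class whose quoted range contains a given diameter. The names `SIZE_RANGES_NM` and `candidate_classes` are hypothetical, and because the text states no lower bound for MLVs, none is assumed here.

```python
from typing import List, Optional, Tuple

# Diameter ranges in nm as quoted in the text above.
# None means the text states no bound on that side.
SIZE_RANGES_NM: List[Tuple[str, Optional[float], Optional[float]]] = [
    ("SUV", 20.0, 50.0),
    ("unilamellar, intermediate", 50.0, 600.0),
    ("LUV", 600.0, 30000.0),
    ("MLV", None, 10000.0),
]


def candidate_classes(diameter_nm: float) -> List[str]:
    """Return every class whose quoted size range contains the diameter."""
    if diameter_nm <= 0:
        raise ValueError("diameter must be positive")
    matches = []
    for name, low, high in SIZE_RANGES_NM:
        above_low = low is None or diameter_nm >= low
        below_high = high is None or diameter_nm <= high
        if above_low and below_high:
            matches.append(name)
    return matches


if __name__ == "__main__":
    # The 80-160 nm target size used in this disclosure falls in the
    # intermediate unilamellar range (and under the MLV upper limit).
    for d in (30.0, 100.0, 20000.0):
        print(d, candidate_classes(d))
```

Lamellarity, not size, is what finally separates a multilamellar vesicle from a unilamellar one, so a lookup of this kind only narrows the possibilities.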

Liposomes may be prepared by a number of methods not all of which
produce the three different types of liposomes. For example, ultrasonic dispersion 
by means of immersing a metal probe directly into a suspension of MLVs is a 
common way for preparing SUVs.

Preparing liposomes of the MLV class usually involves dissolving the lipids
in an appropriate organic solvent and then removing the solvent under a gas or air
stream. This leaves behind a thin film of dry lipid on the surface of the container.
An aqueous solution is then introduced into the container with shaking, in order to
free lipid material from the sides of the container. This process disperses the lipid,
causing it to form into lipid aggregates or liposomes. Liposomes of the LUV variety
may be made by slow hydration of a thin layer of lipid with distilled water or an
aqueous solution of some sort. Alternatively, liposomes may be prepared by
lyophilization. This process comprises drying a solution of lipids to a film under a
stream of nitrogen. This film is then dissolved in a volatile solvent, frozen, and
placed on a lyophilization apparatus to remove the solvent. To prepare a
pharmaceutical formulation containing a drug, a solution of the drug is added to the 
lyophilized lipids, whereupon liposomes are formed. 

Preparing Cationic Liposome/Cationic Peptide/Nucleic Acid Micelles 

Cationic lipids, with the exception of sphingosine and some lipids in
primitive life forms, do not occur in nature. The present invention uses single-chain
amphiphiles which are chloride and bromide salts of the alkyltrimethylammonium
surfactants including but not limited to C12 and C16 chains abbreviated DDAB
(same as DODAB) or CTAB. The molecular geometry of these molecules
determines the critical micelle concentration (ratio between free monomers in
solution and molecules in micelles). Lipid exchange between the two states is a
highly dynamic process; phospholipids have critical micelle concentration values
below 10⁻⁸ M and are more stable in liposomes; however, single chain detergents,
such as stearylamine, may emerge from the liposome membrane upon dilution or
intravenous injection in milliseconds (Lasic, 1997).

Cationic lipids include, but are not limited to, DDAB: dimethyldioctadecyl
ammonium bromide (same as N,N-distearyl-N,N-dimethylammonium bromide);
DMRIE: N-[1-(2,3-dimyristyloxy)propyl]-N,N-dimethyl-N-(2-hydroxyethyl)
ammonium bromide; DODAC: N,N-dioleyl-N,N-dimethylammonium chloride;
DMTAP: 1,2-dimyristoyl-3-trimethylammonium propane; DODAP: 1,2-dioleoyl-3-
dimethylammonium propane; DOGS: dioctadecylamidoglycylspermine; DOTAP
(same as DOTMA): N-(1-(2,3-dioleoyloxy)propyl)-N,N,N-trimethylammonium
chloride; DOSPA: N-(1-(2,3-dioleyloxy)propyl)-N-(2-(sperminecarboxamido)ethyl)-
N,N-dimethylammonium trifluoroacetate; DPTAP: 1,2-dipalmitoyl-3-
trimethylammonium propane; DSTAP: 1,2-distearoyl-3-trimethylammonium propane;
DC-Chol: 3β-(N-(N',N'-dimethylaminoethane)carbamoyl)cholesterol.

Lipid-based vectors used in gene transfer have been formulated in one of two
ways. In one method, the nucleic acid is introduced into preformed liposomes made 
of mixtures of cationic lipids and neutral lipids. The complexes thus formed have 
undefined and complicated structures and the transfection efficiency is severely 
reduced by the presence of serum. Preformed liposomes are commercially available
as LIPOFECTIN and LIPOFECTAMINE. The second method involves the
formation of DNA complexes with mono- or poly-cationic lipids without the 
presence of a neutral lipid. These complexes are prepared in the presence of ethanol 
and are not stable in water. Additionally, these complexes are adversely affected by 
serum (see, Behr, Acc. Chem. Res. 26:274-78 (1993)). An example of a
commercially available poly-cationic lipid is TRANSFECTAM. Other efforts to
encapsulate DNA in lipid-based formulations have not overcome these problems 
(see, Szoka et al., Ann. Rev. Biophys. Bioeng. 9:467 (1980); and Deamer, U.S. Patent 
No. 4,515,736). 

The nucleotide polymers can be single-stranded DNA or RNA, or double-stranded
DNA or DNA-RNA hybrids. Examples of double-stranded DNA include
structural genes, genes including control and termination regions, and
self-replicating systems such as plasmid DNA. Particularly preferred nucleic acids are
plasmids. Single-stranded nucleic acids include antisense oligonucleotides
(complementary to DNA and RNA), ribozymes and triplex-forming
oligonucleotides. In order to increase stability, some single-stranded nucleic acids
will preferably have some or all of the nucleotide linkages substituted with stable, 
non-phosphodiester linkages, including, for example, phosphorothioate, 
phosphorodithioate, phosphoroselenate, methylphosphonate, or O-alkyl
phosphotriester linkages. 

Encapsulating Cationic Liposome/Cationic Peptide/Nucleic Acid
Micelles into Neutral Liposomes

Cationic lipids used with fusogenic peptide/NLS conjugates to provide the 
inner layer of the particle can be any of a number of substances selected from the 
group of DDAB, DODAC, DMRIE, DMTAP, DOGS, DOTAP (DOTMA), DOSPA, 
DPTAP, DSTAP, and DC-Chol. The cationic lipid is combined with DOPE. In one
group of embodiments, the preferred cationic lipid is DDAB:DOPE 1:1.

Neutral lipids used herein to provide the outer layer of the particles can be 
any of a number of lipid species that exist either in an uncharged or neutral 
zwitterionic form at physiological pH. Such lipids are selected from a group 
consisting of diacylphosphatidylcholine, diacylphosphatidylethanolamine, ceramide,
sphingomyelin, cephalin, and cerebrosides. In one group of embodiments, lipids
containing saturated, mono-, or di-unsaturated fatty acids with carbon chain lengths
in the range of C14 to C22 are preferred. In general, less saturated lipids are more
easily sized, particularly when the liposomes must be sized below about 0.16 
microns, for purposes of filter sterilization. Consideration of liposome size, rigidity
and stability of the liposomes in the final preparation, its shelf life without leakage of
the encapsulated DNA, and stability in the bloodstream generally guide the selection 
of neutral lipids for providing the outer coating of our gene vehicles. Lipids having 
a variety of acyl chain groups of varying chain length and degree of saturation are 
available or may be isolated or synthesized by well-known techniques. In another
group of embodiments, lipids with carbon chain lengths in the range of C14 to C22
are used. Preferably, the neutral lipids used in the present invention are 
hydrogenated soy phosphatidylcholine (HSPC), cholesterol, and PEG- 
distearoylphosphatidyl ethanolamine (DSPE) or PEG-ceramide. 

Methods for preparing liposomes

A variety of methods for preparing various liposome forms have been 
described in several issued patents, for example, U.S. Patent Nos. 4,229,360; 
4,224,179; 4,241,046; 4,737,323; 4,078,052; 4,235,871; 4,501,728; and 4,837,028,
as well as in the articles Szoka et al., Ann. Rev. Biophys. Bioeng. 9:467 (1980) and 
Hope et al., Chem. Phys. Lip. 40:89 (1986). These methods do not produce all three 
different types of liposomes (MLVs, SUVs, LUVs). For example, ultrasonic 
dispersion by means of immersing a metal probe directly into a suspension of MLVs
is a common way for preparing SUVs. 

Preparing liposomes of the MLV class usually involves dissolving the lipids 
in an appropriate organic solvent and then removing the solvent under a gas or air 
stream. This leaves behind a thin film of dry lipid on the surface of the container.
An aqueous solution is then introduced into the container with shaking, in order to
free lipid material from the sides of the container. This process disperses the lipid, 
causing it to form into lipid aggregates or liposomes. Liposomes of the LUV variety 
may be made by slow hydration of a thin layer of lipid with distilled water or an 
aqueous solution of some sort. Alternatively, liposomes may be prepared by
lyophilization. This process comprises drying a solution of lipids to a film under a
stream of nitrogen. The film is then dissolved in a volatile solvent, frozen, and 
placed on a lyophilization apparatus to remove the solvent. To prepare a 
pharmaceutical formulation containing a drug, a solution of the drug is added to the 
lyophilized lipids, whereupon liposomes are formed. 

Following liposome preparation, the liposomes may be sized to achieve a
desired size range and relatively narrow distribution of liposome sizes. Preferably, 
the preformed liposomes are sized to a mean diameter of about 80 to 160 nm (the 
upper size limit for filter sterilization before in vivo administration). Several 
techniques are available for sizing liposomes to a desired size. Sonicating a
liposome suspension either by bath or probe sonication produces a progressive size
reduction down to small unilamellar vesicles less than about 0.05 microns (50 nm) in
size. Extrusion of liposomes through a small-pore polycarbonate membrane is our preferred
method for reducing liposome sizes to a relatively well-defined size distribution. 
The liposomes may be extruded through successively smaller-pore membranes, to
achieve a gradual reduction in liposome size.

One way used to coat DNA with lipid is by controlled detergent depletion 
from a cationic lipid/DNA/detergent complex. This method can give complexes 
with stability in plasma. Hofland et al. (1996) have prepared such complexes by
dialysis of a mixture of DOSPA/DOPE/DNA/octylglucoside. 

Pharmaceutical compositions comprising the cationic liposome/nucleic acid 
complexes of the invention are prepared according to standard techniques and further 
comprise a pharmaceutically acceptable carrier. Generally, normal saline will be
employed as the pharmaceutically acceptable carrier. 

For in vivo administration, the pharmaceutical compositions are preferably 
administered parenterally, i.e., intravenously, intraperitoneally, subcutaneously,
intrathecally, by injection to the spinal cord, intramuscularly, intraarticularly, by portal
vein injection, or intratumorally. More preferably, the pharmaceutical compositions
are administered intravenously or intratumorally by a bolus injection. In other 
methods, the pharmaceutical preparations may be contacted with the target tissue by 
direct application of the preparation to the tissue. The application may be made by
topical "open" or "closed" procedures. The term "topical" means the direct
application of the pharmaceutical preparation to a tissue exposed to the environment,
such as the skin, to any surface of the body, nasopharynx, external auditory canal, 
ocular administration and administration to the surface of any body cavities, 
inhalation to the lung, genital mucosa and the like.

"Open" procedures are those procedures that include incising the skin of a
patient and directly visualizing the underlying tissue to which the pharmaceutical
preparations are applied. This is generally accomplished by a surgical procedure, 
such as a thoracotomy to access the lungs, abdominal laparotomy to access 
abdominal viscera, or other direct surgical approach to the target tissue.

"Closed" procedures are invasive procedures in which the internal target
tissues are not directly visualized, but accessed via insertion of instruments through
small wounds in the skin. For example, the preparations may be administered to the 
peritoneum by needle lavage. Likewise, the pharmaceutical preparations may be 
administered to the meninges or spinal cord by infusion during a lumbar puncture 
followed by appropriate positioning of the patient as commonly practiced for spinal
anesthesia or metrizamide imaging of the spinal cord. Alternatively, the
preparations may be administered through endoscopic devices.

EXAMPLES

Materials and Methods 

DDAB, DOPE (dioleoylphosphatidylethanolamine) and most other lipids 
used here were purchased from Avanti Polar Lipids; PEG-DSPE was from Syngena. 

Engineering of plasmid pLF 

The pGL3-C (Promega) was cut with XbaI and blunt-ended using the
Klenow fragment of E. coli DNA polymerase. It was then cut with HindIII and the
1689-bp fragment, carrying the luciferase gene, was gel-purified. The pGFP-N1
plasmid (Clontech) was cut with SmaI and HindIII and the 4.7 kb fragment, isolated
from an agarose gel, was ligated with the luciferase fragment. JM109 E. coli cells
were transformed and 20 colonies were selected; about half of them showed the
presence of inserts; 8 clones with inserts were cut with BamHI and XhoI to further
confirm the presence of the luciferase gene; seven of them were positive.
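
As a quick plausibility check on the cloning described above, the expected size of the ligated construct and the fraction of confirmed clones can be worked out from the figures in the text. The sketch below is illustrative only; the function names are hypothetical, and the 4.7 kb backbone is taken as 4700 bp even though the text gives it only to two significant figures.

```python
LUCIFERASE_FRAGMENT_BP = 1689   # gel-purified fragment from pGL3-C (from the text)
BACKBONE_FRAGMENT_BP = 4700     # "4.7 kb" fragment from pGFP-N1; rounded value, an assumption


def expected_construct_bp(insert_bp: int, backbone_bp: int) -> int:
    """Size of a circular plasmid formed by ligating one insert into one backbone."""
    return insert_bp + backbone_bp


def fraction_positive(positive: int, tested: int) -> float:
    """Fraction of tested clones that were confirmed by restriction digest."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return positive / tested


if __name__ == "__main__":
    size = expected_construct_bp(LUCIFERASE_FRAGMENT_BP, BACKBONE_FRAGMENT_BP)
    print(f"Expected pLF size: about {size} bp ({size / 1000:.1f} kb)")
    # 8 insert-containing clones were digested with BamHI and XhoI; 7 were positive.
    print(f"Confirmed clones: {fraction_positive(7, 8):.1%}")
```

Under these assumptions the construct is expected to run at roughly 6.4 kb on a gel.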

Radiolabeled plasmid pLF was generated by culturing Escherichia coli in
³H-thymidine-5-triphosphate or ³²P inorganic phosphate (5 mCi) (Dupont/NEN,
Boston, Mass.) and purified using standard techniques as described above. 

DLS measurements 

A Coulter N4M light scattering instrument was used, at a 90° angle, set at a
run time of 200 sec, using 4 to 25 microsec sample time. The scan of the particle 
size distribution was obtained in 1 ml sample volume using plastic cuvettes, at 20°C 
and at 0.01 poise viscosity. 
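
For orientation, the temperature and viscosity settings quoted above are the inputs needed to convert a measured translational diffusion coefficient into a hydrodynamic diameter through the Stokes-Einstein relation, d = kT / (3πηD). The sketch below is illustrative only: it assumes spherical, non-interacting particles, and the function names are hypothetical and are not part of the Coulter N4M software.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant, SI units


def poise_to_pascal_seconds(viscosity_poise: float) -> float:
    """1 poise = 0.1 Pa*s."""
    return viscosity_poise * 0.1


def hydrodynamic_diameter_m(diffusion_m2_per_s: float,
                            temp_celsius: float = 20.0,
                            viscosity_poise: float = 0.01) -> float:
    """Stokes-Einstein diameter for a sphere, using the settings quoted in the text."""
    temp_kelvin = temp_celsius + 273.15
    eta = poise_to_pascal_seconds(viscosity_poise)
    return BOLTZMANN_J_PER_K * temp_kelvin / (3.0 * math.pi * eta * diffusion_m2_per_s)


def diffusion_coefficient_m2_per_s(diameter_m: float,
                                   temp_celsius: float = 20.0,
                                   viscosity_poise: float = 0.01) -> float:
    """Inverse of the relation above: diffusion coefficient for a given diameter."""
    temp_kelvin = temp_celsius + 273.15
    eta = poise_to_pascal_seconds(viscosity_poise)
    return BOLTZMANN_J_PER_K * temp_kelvin / (3.0 * math.pi * eta * diameter_m)


if __name__ == "__main__":
    # Diffusion coefficients corresponding to the 80-160 nm target range.
    for nm in (80, 100, 160):
        d_coeff = diffusion_coefficient_m2_per_s(nm * 1e-9)
        print(f"{nm} nm -> D = {d_coeff:.3e} m^2/s")
```

Smaller particles diffuse faster, so the 80 nm end of the target range corresponds to twice the diffusion coefficient of the 160 nm end.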

In one aspect, this invention provides a method for entrapping DNA into
lipids that enhances the content of plasmid per volume unit, and reduces the toxicity
of the cationic lipids used to trap plasmid or oligonucleotide DNA. The DNA 
becomes hidden in the inner membrane bilayer of the final complex. Furthermore, 
the gene transfer complex is endowed with long circulation time in body fluids and 
extravasates preferentially into solid tumors and their metastatic foci and nodules.
The extravasation occurs through their vasculature at most sites of the human or
animal body after intravenous injection of the gene-carrying vehicles. This occurs 
because of their small size (100-160 nm), their content in neutral to slightly
negatively-charged lipids in their outer membrane bilayers, and their coating with
PEG. These gene delivery vehicles are able to cross the cell membrane barrier after 
they reach the extracellular tumor space because of the presence of fusogenic 
peptides conjugated with karyophilic peptides. The vehicles assume a certain 
predefined orientation in the lipid membrane with their positive ends directed toward
DNA and their hydrophobic tail buried inside the hydrophobic lipid bilayer. The 
labile NLS-fusogenic peptide linkage is cleaved after endocytosis and the remaining 
NLS peptide bound to plasmid DNA aids its nuclear uptake. This occurs especially 
when non-dividing cells are targeted, such as liver, spleen or bone marrow cells that 
represent the major sites for extravasation and concentration of these vehicles other
than solid tumors. 

Organic solvent 

A suitable solvent for preparing a micelle from the desired lipid components 
is ethanol, methanol, or other aliphatic alcohols such as propanol, isopropanol,
butanol, tert-butanol, iso-butanol, pentanol and hexanol. Mixtures of two or more 
solvents may be used in the practice of the invention. It is also to be understood that 
any solvent that is miscible with an ethanol solution, even in small amounts, can be 
used to improve micelle formation and its subsequent conversion into liposomes, 
including chloroform, dichloromethane, diethylether, cyclohexane, cyclopentane,
benzene, and toluene. 

Cationic lipids 

In a further embodiment, the liposome encapsulated DNA described herein 
further comprises an effective amount of cationic lipids. Cationic lipids have been
widely used for gene transfer; a number of clinical trials (34 out of 220 total RAC- 
approved protocols as of December, 1997) use cationic lipids. Although many cell 
culture studies have been documented, systemic delivery of genes with cationic 
lipids in vivo has been very limited. All clinical protocols use subcutaneous, 
intradermal, intratumoral, and intracranial injection as well as intranasal,
intrapleural, or aerosol administration but not I.V. delivery, because of the toxicity of
the cationic lipids and DOPE (see, Martin and Boulikas, 1998). Liposomes 
formulated from DOPE and cationic lipids based on diacyltrimethylammonium
propane (dioleoyl-, dimyristoyl-, dipalmitoyl-, distearoyl-trimethylammonium
propane or DOTAP, DMTAP, DPTAP, DSTAP, respectively) or DDAB were highly
toxic when incubated in vitro with phagocytic cells (macrophages and U937 cells),
but not towards non-phagocytic T lymphocytes. The rank order of toxicity was
DOPE/DDAB > DOPE/DOTAP > DOPE/DMTAP > DOPE/DPTAP >
DOPE/DSTAP; and the toxicity was determined from the effect of the cationic
liposomes on the synthesis of nitric oxide (NO) and TNF-α produced by activated
macrophages (Filion and Phillips, 1997). 

Another aspect to be considered before I.V. injection is undertaken is that
negatively charged serum proteins can interact and cause inactivation of cationic 
liposomes (Yang and Huang, 1997). Condensing agents used for plasmid delivery 
including polylysine, transferrin-polylysine, a fifth-generation poly(amidoamine) 
(PAMAM) dendrimer, poly(ethyleneimine), and several cationic lipids (DOTAP,
DC-Chol/DOPE, DOGS/DOPE, and DOTMA/DOPE), were found to activate the
complement system to varying extents. Strong complement activation was seen with 
long-chain polylysines, the dendrimer, poly(ethyleneimine), and DOGS. Modifying 
the surface of preformed DNA complexes with polyethyleneglycol (Plank et al., 
1996) considerably reduced complement activation. 

Cationic lipids increase the transfection efficiency by destabilizing the
biological membranes, including plasma, endosomal, and lysosomal membranes.
Incubation of isolated lysosomes with low concentrations of DOTAP caused a
striking increase in free activity of β-galactosidase, and even a release of the enzyme
into the medium. This demonstrates that the lysosomal membrane is deeply
destabilized by the lipid. The mechanism of destabilization was thought to involve
an interaction between cationic liposomes and anionic lipids of the lysosomal 
membrane, thus allowing a fusion between the lipid bilayers. The process was less 
pronounced at pH 5 than at pH 7.4, and anionic amphipathic lipids were able to
partially prevent this membrane destabilization (Wattiaux et al., 1997).

In contrast to DOTAP and DMRIE that were 100% charged at pH 7.4,
DC-CHOL was only about 50% charged as monitored by a pH-sensitive fluorophore.
This difference decreases the charge on the external surfaces of the liposomes, and 
was proposed to promote an easier dissociation of bilayers containing DC-CHOL
from the plasmid DNA, and an increase in release of the DNA-lipid complex into the 
cytosol from the endosomes (Zuidam and Barenholz, 1997). 
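
The relationship between pH and the charged fraction of a titratable lipid such as DC-CHOL can be illustrated with the Henderson-Hasselbalch relation. The sketch below is illustrative only and rests on two assumptions that are not taken from the cited study: that the lipid behaves as a single ideal titratable amine, and that its apparent pKa in the bilayer is about 7.4, a value inferred solely from the statement that it is about 50% charged at pH 7.4.

```python
def charged_fraction(ph: float, pka: float) -> float:
    """Fraction of a monoprotic base present in its protonated (cationic) form."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


# Assumption: apparent pKa inferred from "about 50% charged at pH 7.4".
APPARENT_PKA_DC_CHOL = 7.4

if __name__ == "__main__":
    # pH 7.4 approximates the extracellular medium; pH 5 approximates an
    # acidified endosome or lysosome.
    for ph in (7.4, 6.0, 5.0):
        frac = charged_fraction(ph, APPARENT_PKA_DC_CHOL)
        print(f"pH {ph}: {frac:.1%} charged")
```

Under these assumptions the lipid would be about half charged in the extracellular medium and almost fully charged in an acidified compartment, whereas a quaternary ammonium lipid such as DOTAP carries a fixed charge at any pH.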

Although cationic lipids have been used widely for the delivery of genes, 
very few studies have used systemic I.V. injection of cationic liposome-plasmid
complexes. This is because of the toxicity of the lipid component in animal models,
not humans. Administration by I.V. injection of two types of cationic lipids of
similar structure, DOTMA and DOTAP, shows that the transfection efficiency is
determined mainly by the structure of the cationic lipid and the ratio of cationic lipid
to DNA; the luciferase and GFP gene expression in different organs was transient,
with a peak level between 4 and 24 hr, dropping to less than 1% of the peak level by 
day 4 (Song et al., 1997). 

A number of different organs in vivo can be targeted after liposomal delivery 
of genes or oligonucleotides. Intravenous injection of cationic liposome-plasmid
complexes by tail vein in mice targeted mainly the lung and to a smaller extent the
liver, spleen, heart, kidney and other organs (Zhu et al., 1993). After intraperitoneal
injection of a plasmid-liposome complex expressing antisense K-ras RNA in nude
mice inoculated i.p. with AsPC-1 pancreatic cancer cells harboring K-ras point
mutations, PCR analysis indicated that the injected DNA was delivered to
various organs except brain (Aoki et al., 1995).

A number of factors for DOTAP:cholesterol/DNA complex preparation 
including the DNA:liposome ratio, mild sonication, heating, and extrusion were
found to be crucial for improved systemic delivery; maximal gene expression was
obtained when a homogeneous population of DNA:liposome complexes between
200 and 450 nm in size was used. Cryo-electron microscopy showed that the DNA
was condensed on the interior of invaginated liposomes between two lipid bilayers in 
these formulations, a factor that was thought to be responsible for the high 
transfection efficiency in vivo and for the broad tissue distribution (Templeton et al., 
1997). 

Steps to improve liposome-mediated gene delivery to somatic cells include
persistence of the plasmid in blood circulation, port of entry and transport across the
cell membrane, release from endosomal compartments into the cytoplasm, nuclear
import by docking through the pore complexes of the nuclear envelope, expression
driven by the appropriate promoter/enhancer control elements, and persistence of the 
plasmid in the nucleus for long periods (Boulikas, 1998a). 

Plasmid condensation with spermine

In a further embodiment, the liposome encapsulated DNA described herein is 
condensed with spermine and/or spermidine. DNA can be presented to cells in 
culture as a complex with polycations such as polylysine, or basic proteins such as 
protamine, total histones or specific histone fractions (Boulikas and
Martin, 1997). The interaction of plasmid DNA with protamine sulfate, followed by
the addition of DOTAP cationic liposomes, offered a better protection of plasmid 
DNA against enzymatic digestion. The method gave consistently higher gene 
expression in mice via tail vein injection as compared with DOTAP/DNA 
complexes. 50 µg of luciferase-plasmid per mouse gave 20 ng luciferase protein per
mg extracted tissue protein in the lung, that was detected as early as 1 h after
injection, peaked at 6 h and declined thereafter. Intraportal injection of 
protamine/DOTAP/DNA led to about a 100-fold decrease in gene expression in the 
lung as compared with I.V. injection. Endothelial cells were the primary locus of
lacZ transgene expression (Li and Huang, 1997). Protamine sulfate enhanced
plasmid delivery into several different types of cells in vitro, using the monovalent
cationic liposomal formulations (DC-Chol and lipofectin). This effect was less
pronounced with the multivalent cationic liposome formulation, lipofectamine (Sorgi
et al., 1997).

Spermine is found to enhance the transfection efficiency of DNA-cationic
liposome complexes in cell culture and in animal studies. This biogenic polyamine
at high concentrations caused liposome fusion most likely promoted by the 
simultaneous interaction of one molecule of spermine (four positively charged amino 
groups) with the polar head groups of two or more molecules of lipids. At low 
concentrations (0.03-0.1 mM) it promoted anchorage of the liposome-DNA complex 
to the surface of cells and significantly enhanced transfection efficiency (Boulikas,
unpublished).

The polycations polybrene, protamine, DEAE-dextran, and poly-L-lysine
significantly increased the efficiency of adenovirus-mediated gene transfer in cell
culture. This was thought to act by neutralizing the negative charges presented by
membrane glycoproteins that reduce the efficiency of adenovirus-mediated gene
transfer (Arcasoy et al., 1997).

Oligonucleotide transfer 

In a further embodiment, the liposome encapsulates oligonucleotide DNA. 
Encapsulation of oligonucleotides into liposomes increased their therapeutic index,
prevented degradation in cultured cells and in human serum, and reduced toxicity to
cells (Thierry and Dritschilo, 1992; Capaccioli et al., 1993; Lewis et al., 1996). 
However, most studies have been performed in cell culture, and very few in animals 
in vivo. There are still an important number of improvements needed before these 
approaches can move into clinical studies. 

Zelphati and Szoka (1997) have found that complexes of fluorescently
labeled oligonucleotides with DOTAP liposomes, entered the cell using an endocytic 
pathway mainly involving uncoated vesicles. Oligonucleotides were redistributed 
from punctate cytoplasmic regions into the nucleus. This process was independent 
of acidification of the endosomal vesicles. The nuclear uptake of oligonucleotides
depended on several factors, such as charge of the particle, where positively charged
complexes were required for enhanced nuclear uptake. DOTAP increased the antisense
activity of a specific anti-luciferase oligonucleotide more than 100-fold.
Physicochemical studies of oligonucleotide-liposome complexes of different cationic 
lipid compositions indicated that either phosphatidylethanolamine or negative
charges on other lipids in the cell membrane are required for efficient fusion with
cationic liposome-oligonucleotide complexes to promote entry to the cell 
(Jaaskelainen et al., 1994). 

Similar results were reported by Lappalainen et al. (1997). Digoxigenin-labeled
oligodeoxynucleotides (ODNs) complexed with the polycationic DOSPA
and the monocationic DDAB (with DOPE as a helper lipid) were taken up by CaSki
cells in culture by endocytosis. The nuclear membrane was found to pose a barrier 
against nuclear import of ODNs that accumulated in the perinuclear area. Although 
DOSPA/DOPE liposomes could deliver ODNs into the cytosol, they were unable to
mediate nuclear import of ODNs. On the contrary, oligonucleotide-DDAB/DOPE
complexes with a net positive charge were released from vesicles into the cytoplasm.
It was determined that DDAB/DOPE mediated nuclear import of the
oligonucleotides.

DOPE-heme (ferric protoporphyrin IX) conjugates, inserted in cationic lipid
particles with DOTAP, protected oligoribonucleotides from degradation in human
serum and increased oligoribonucleotide uptake into 2.2.15 human hepatoma cells.
The enhancing effect of heme was evident only at a net negative charge in the
particles (Takle et al., 1997). Uptake of liposomes labeled with 111In and
composed of DC-Chol and DOPE was primarily by liver, with some accumulation in
spleen and skin and very little in the lung after intravenous tail vein
injection. Preincubation of cationic liposomes with phosphorothioate
oligonucleotide induced a dramatic, yet transient, accumulation of the lipid in
lung that gradually redistributed to liver. The mechanism of lung uptake
involved entrapment of large aggregates of oligonucleotides within pulmonary
capillaries at 15 min post-injection via embolism. Labeled oligonucleotide was
localized primarily to phagocytic vacuoles of Kupffer cells at 24 h
post-injection. Nuclear uptake of oligonucleotides in vivo was not observed
(Litzinger et al., 1996).

Polyethylene glycol (PEG)-coated liposomes 

In a further embodiment, the liposome-encapsulated DNA described herein
further comprises a PEG coating applied to the final complex in step 2 (Fig. 1).
It is often desirable to conjugate a lipid to a polymer that confers extended
half-life, such as polyethylene glycol (PEG). Derivatized lipids that can be
employed include PEG-modified DSPE and PEG-ceramide. Addition of PEG components
prevents complex aggregation, increases the circulation lifetime of particles
(liposomes, proteins, other complexes, drugs) and increases the delivery of
lipid-nucleic acid complexes to the target tissues. See Maxfield et al., Polymer
16:505-509 (1975); Bailey, F.E. et al., in: Nonionic Surfactants, Schick, M.J.,
ed., pp. 794-821 (1967); Abuchowski, A. et al., J. Biol. Chem. 252:3582-3586
(1977); Abuchowski, A. et al., Cancer Biochem. Biophys. 7:175-186 (1984); Katre,
N.V. et al., Proc. Natl. Acad. Sci. USA 84:1487-1491 (1987); Goodson, R. et al.,
Bio/Technology 8:343-346 (1990).

Conjugation to PEG is reported to reduce immunogenicity and toxicity. See
Abuchowski et al., J. Biol. Chem. 252:3578-3581 (1977). The extent to which PEG
coating enhances the blood circulation time of liposomes is described in U.S.
Patent No. 5,013,556. Typically, the concentration of the PEG-modified
phospholipid or PEG-ceramide in the complex will be about 1-7%. In a
particularly preferred embodiment, the PEG-modified lipid is PEG-DSPE.

Coating the surface of liposomes with inert materials designed to camouflage
the liposome from the body's host defense systems was shown to increase
remarkably the plasma longevity of liposomes. The biological paradigm for this
"surface-modified" sub-branch was the erythrocyte, a cell that is coated with a
dense layer of carbohydrate groups, and that manages to evade immune system
detection and to circulate for several months (before being removed by the same
type of cell responsible for removing liposomes).

The first breakthrough came in 1987, when a glycolipid (the brain
tissue-derived ganglioside GM1) was identified that, when incorporated within
the lipid matrix, allowed liposomes to circulate for many hours in the blood
stream (Allen and Chonn, 1987). A second glycolipid, phosphatidylinositol, was
also found to impart long plasma residence times to liposomes and, since it was
extracted from soybeans rather than brain tissue, was believed to be a more
pharmaceutically acceptable excipient (Gabizon et al., 1989).

A major advance in the surface-modified sub-branch was the development of
polymer-coated liposomes (Allen et al., 1991). Polyethylene glycol (PEG)
modification had been used for many years to prolong the half-lives of
biological proteins (such as enzymes and growth factors) and to reduce their
immunogenicity (e.g. Beauchamp et al., 1983). It was reported in the early 1990s
that PEG-coated liposomes circulated for remarkably long times after intravenous
administration. Half-lives on the order of 24 h were seen in mice and rats, and
over 30 hours in dogs. The term "stealth" was applied to these liposomes because
of their ability to evade interception by the immune system. The hydrophilic PEG
polymers form dense "conformational clouds" that prevent other macromolecules
from interacting with the surface, even at low concentrations of the protecting
polymer (Gabizon and Papahadjopoulos, 1988; Papahadjopoulos et al., 1991;
reviewed by Torchilin, 1998). The increased hydrophilicity of the liposomes
after their coating with the amphipathic PEG5000 leads to a reduction in
nonspecific uptake by the reticuloendothelial system.

Whereas the half-life of antimyosin immunoliposomes was 40 min, coating them
with PEG increased their half-life to 1000 min after intravenous injection into
rabbits (Torchilin et al., 1992).
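
The size of this effect can be made concrete with a short calculation. The
sketch below computes the fold increase in half-life from the two values quoted
above and, under the simplifying assumption of single-exponential (first-order)
clearance, the fraction of an injected dose still circulating at a given time.
The first-order model and the name fraction_remaining are illustrative
assumptions and are not taken from the cited study.

```python
def fraction_remaining(elapsed_min: float, half_life_min: float) -> float:
    """Fraction of the dose still circulating after elapsed_min minutes.

    Assumes simple first-order (single-exponential) clearance, which is an
    illustrative simplification rather than a measured pharmacokinetic model.
    """
    if half_life_min <= 0:
        raise ValueError("half-life must be positive")
    return 0.5 ** (elapsed_min / half_life_min)


# Half-lives quoted in the text (Torchilin et al., 1992), in minutes.
UNCOATED_HALF_LIFE_MIN = 40.0
PEG_COATED_HALF_LIFE_MIN = 1000.0

# Ratio of the two half-lives: how many times longer the coated form persists.
fold_increase = PEG_COATED_HALF_LIFE_MIN / UNCOATED_HALF_LIFE_MIN

if __name__ == "__main__":
    print(f"fold increase in half-life: {fold_increase:.0f}")
    for minutes in (40, 120, 1000):
        uncoated = fraction_remaining(minutes, UNCOATED_HALF_LIFE_MIN)
        coated = fraction_remaining(minutes, PEG_COATED_HALF_LIFE_MIN)
        print(f"t = {minutes:4d} min: uncoated {uncoated:.3f}, PEG-coated {coated:.3f}")
```

Under this assumption, two hours after injection one-eighth of the uncoated
immunoliposomes would remain in circulation, against about 92% of the PEG-coated
ones.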

Micelles, surfactants and small unilamellar vesicles

In a further embodiment, the preparation of the liposome-encapsulated DNA
described herein further comprises an initial step of micelle formation between
cationic lipids and condensed plasmid or oligonucleotide DNA in ethanol
solutions. Micelles are small amphiphilic colloidal particles formed by certain
kinds of lipid molecules, detergents or surfactants under defined conditions of
concentration, solvent and temperature. They are composed of a single lipid
layer. Micelles can have their hydrophilic head groups assembled, exposing their
hydrophobic tails to the solvent (for example in 30-60% aqueous ethanol
solution), or can reverse their structures, exposing their polar heads toward
the solvent, such as by lowering the concentration of the ethanol to below 10%
(reverse micelles). Micelle systems are in thermodynamic equilibrium with the
solvent molecules and environment. This results in constant phase changes,
especially upon contact with biological materials, such as upon introduction to
cell culture, injection into animals, dilution, or contact with proteins or
other macromolecules. These changes result in rapid micelle disassembly or
flocculation. This is in contrast to the much higher stability of liposome
bilayers.

Single-chain surfactants are able to form micelles (see Table 1, below).
These include the anionic (sodium dodecyl sulfate, cholate or oleate) or cationic
(cetyl-trimethylammonium bromide, CTAB) surfactants. CTAB, CTAC, and DOIC
micelles yielded larger solubility gaps (lower concentration of colloidally
suspended DNA) than corresponding SUV particles containing neutral lipid and
CTAB (1:1) (Lasic, 1997).
Table 1: Molecules able to form micelles

Molecule | Reference
CTAB, CTAC, DOIC | Lasic, 1997
Detergent/phospholipid micelles | Lusa et al., 1998
Dodecyl betaine (amphoteric surfactant) | de la Maza et al., 1998
Dodecylphosphocholine cholate | Lasic, 1997
Glycine-conjugated bile salt (anionic steroid detergent-like molecule) | Leonard and Cohen, 1998
Lipid-dodecyl maltoside micelles | Lambert et al., 1998
Mixed micelles (Triton X-100 and phosphatidylcholine) | Lopez et al., 1998
Octylglucoside (non-ionic straight chain detergent) | Leonard and Cohen, 1998
Oleate | Lasic, 1997
PEG-dialkylphosphatidic acid (dihexadecylphosphatidyl (DHP)-PEG2000) | Tirosh et al., 1998
Phosphatidylcholine (neutral zwitterionic) | Schroeder et al., 1990
Polyethyleneglycol (MW 5000)-distearoyl phosphatidyl ethanolamine (PEG-DSPE) | Weissig et al., 1998
Sodium dodecyl sulfate (anionic straight chain detergent) | Leonard and Cohen, 1998
Sodium taurofusidate (conjugated fungal bile salt analog) | Leonard and Cohen, 1998
Taurine-conjugated bile salts (anionic steroid detergent-like molecule) | Leonard and Cohen, 1998
Triton X-100 surfactant | Lasic, 1997

There is a critical detergent/phospholipid ratio at which the
lamellar-to-micellar transition occurs. For example, the vesicle-micelle
transition was observed for dodecyl maltoside with large unilamellar liposomes.
A striking feature of the solubilization process by dodecyl maltoside was the
discovery of a new phase, consisting of a very viscous "gel-like" structure
composed of long filamentous thread-like micelles, over 1 to 2 microns in
length.

A long-circulating complex needs to be slightly anionic. Therefore, the
liposomes used for the conversion of the micelles into liposomes contain bipolar
lipids (PC, PE) and 1-30% negatively charged lipids (DPPG). The cationic lipids,
which are toxic, are hidden in the inner liposome membrane bilayer. Those
reaching the solid tumor will exert their toxic effects, causing apoptosis.
Apoptosis will be caused by the delivery of the toxic drug or anti-neoplastic
gene or oligonucleotide to the cancer cell, but also by the localization of the
cationic lipids (along with plasmid DNA) to the nucleus. Indeed, a number of
studies suggest that plasmid DNA is imported to nuclei; its translocation
carries with it the cationic lipid molecules electrostatically attached to the
DNA. These cationic lipid molecules exert their toxicity by interfering with the
nucleosome and domain structure of the chromatin, causing local destabilization.
This disturbance or aberrant chromatin reorganization could be exerted at the
level of the nuclear matrix, where plasmid DNA is attached for transcription,
autonomous replication, or integration via recombination.

Surfactants have found wide application in formulations such as emulsions
(including microemulsions) and liposomes. The most common way of classifying
and ranking the properties of the many different types of surfactants, both
natural and synthetic, is by the use of the hydrophile/lipophile balance (HLB).
The use of surfactants in drug products, formulations and in emulsions has been
reviewed (Rieger, in: Pharmaceutical Dosage Forms, Marcel Dekker, Inc., New
York, 1988, p. 285).

Nonionic surfactants find wide application in pharmaceutical and cosmetic
products and are usable over a wide range of pH values. In general, their HLB
values range from 2 to about 18, depending on their structure. Nonionic
surfactants include nonionic esters such as ethylene glycol esters, propylene
glycol esters, glyceryl esters, polyglyceryl esters, sorbitan esters, sucrose
esters, and ethoxylated esters. Nonionic alkanolamides and ethers, such as fatty
alcohol ethoxylates, propoxylated alcohols, and ethoxylated/propoxylated block
polymers, are also included in this class. The polyoxyethylene surfactants are
the most popular members of the nonionic surfactant class.

Anionic surfactants include carboxylates such as soaps, acyl lactylates, acyl
amides of amino acids, esters of sulfuric acid such as alkyl sulfates and
ethoxylated alkyl sulfates, sulfonates such as alkyl benzene sulfonates, acyl
isethionates, acyl taurates and sulfosuccinates, and phosphates. The most
important members of the anionic surfactant class are the alkyl sulfates and the
soaps.

Cationic surfactants include quaternary ammonium salts and ethoxylated
amines. The quaternary ammonium salts are the most used members of this class.
If the surfactant molecule has the ability to carry either a positive or a
negative charge, the surfactant is classified as amphoteric. Amphoteric
surfactants include acrylic acid derivatives, substituted alkylamides,
N-alkylbetaines and phosphatides.

Classical micelles may not be effective as gene transfer vehicles, but they
are important intermediates in the formation of liposome complexes encapsulating
drugs or nucleic acids. The stability of single-chain surfactant-DNA colloidal
systems is lower than that of SUV particles containing neutral lipid and CTAB
(1:1). However, second-generation micelles are able to target tumors in vivo.
Weissig and co-workers (1998) used the soybean trypsin inhibitor (STI) as a
model protein to target tumors. STI was modified with a hydrophobic residue of
N-glutaryl-phosphatidylethanolamine (NGPE) and incorporated into both
polyethyleneglycol (MW 5000)-distearoyl phosphatidyl ethanolamine (PEG-DSPE)
micelles (< 20 nm) and PEG-DSPE-modified long-circulating liposomes (ca. 100
nm). As determined from the protein label, using 111In attached to soybean
trypsin inhibitor via protein-attached diethylene triamine pentaacetic acid
(DTPA), PEG-lipid micelles accumulated better than the same protein anchored in
long-circulating PEG-liposomes in subcutaneously established Lewis lung
carcinoma in mice after tail vein injection.

Loading a liposomal dispersion with an amphiphilic drug may cause a phase
transformation into a micellar solution. The transition from high ratios of
phospholipid to drug (from 2:1 to 1:1 downwards) was accompanied by the
conversion of liposomal dispersions of milky-white appearance (particle size 200
nm) to nearly transparent micelles (particle size below 25 nm). See Schutze and
Muller-Goymann (1998).

Fusogenic peptides 

In a further embodiment, the liposome-encapsulated DNA described herein
further comprises an effective amount of a fusogenic peptide. Fusogenic peptides
belong to a class of helical amphipathic peptides characterized by a
hydrophobicity gradient along the long helical axis. This hydrophobicity
gradient causes the tilted insertion of the peptides in membranes, thus
destabilizing the lipid core and thereby enhancing membrane fusion (Decout et
al., 1999).
Hemagglutinin (HA) is a homotrimeric surface glycoprotein of the influenza
virus. In infection, it induces membrane fusion between viral and endosomal
membranes at low pH. Each monomer consists of the receptor-binding HA1 domain
and the membrane-interacting HA2 domain. The NH2-terminal region of the HA2
domain (amino acids 1 to 127), the so-called "fusion peptide," inserts into the
target membrane and plays a crucial role in triggering fusion between the viral
and endosomal membranes. Based on the substitution of eight amino acids in
region 5-14 with cysteines and spin-labeling electron paramagnetic resonance, it
was concluded that the peptide forms an alpha-helix tilted approximately 25
degrees from the horizontal plane of the membrane, with a maximum depth of 15 Å
from the phosphate group (Macosko et al., 1997). Use of fusogenic peptides from
influenza virus hemagglutinin HA-2 greatly enhanced the efficiency of
transferrin-polylysine-DNA complex uptake by cells. The peptide was linked to
polylysine and the complex was delivered by transferrin receptor-mediated
endocytosis (reviewed by Boulikas, 1998a). This peptide has the sequence
GLFEAIAGFIENGWEGMIDGGGYC (SEQ ID NO:9) and is able to induce the release of the
fluorescent dye calcein from liposomes prepared with egg yolk
phosphatidylcholine; the release was higher at acidic pH. This peptide was also
able to increase up to 10-fold the anti-HIV potency of antisense
oligonucleotides, at a concentration of 0.1-1 mM, using CEM-SS lymphocytes in
culture. This peptide changes conformation in the slightly more acidic
environment of the endosome, destabilizing and breaking the endosomal membrane
(reviewed by Boulikas, 1998a).

The presence of negatively charged lipids in the membrane is important for
the manifestation of the fusogenic properties of some peptides, but not of
others. Whereas the fusogenic action of a peptide representing a putative fusion
domain of fertilin, a sperm surface protein involved in sperm-egg fusion, was
dependent upon the presence of negatively charged lipids, that of the HIV2
peptide was not (Martin and Ruysschaert, 1997).

For example, to analyze the two domains on the fusogenic peptides of
influenza virus hemagglutinin HA, HA chimeras were designed in which the
cytoplasmic tail and/or transmembrane domain of HA was replaced with the
corresponding domains of the fusogenic glycoprotein F of Sendai virus.
Constructs of HA were made in which the cytoplasmic tail was replaced by
peptides of human neurofibromin type 1 (NF1) (residues 1441 to 1518) or c-Raf-1
(residues 51 to 131), and were expressed in CV-1 cells by using the vaccinia
virus-T7 polymerase transient-expression system. Membrane fusion between CV-1
cells and bound human erythrocytes (RBCs) mediated by parental or chimeric HA
proteins showed that, after the pH was lowered, a flow of the aqueous
fluorophore calcein from preloaded RBCs into the cytoplasm of the
protein-expressing CV-1 cells took place. This indicated that membrane fusion
involves both leaflets of the lipid bilayers and leads to formation of an
aqueous fusion pore (Schroth-Diaz et al., 1998).

A remarkable discovery was that the TAT protein of HIV is able to cross cell
membranes (Green and Loewenstein, 1988) and that a 36-amino acid domain of TAT,
when chemically cross-linked to heterologous proteins, conferred the ability to
transduce into cells. The 11-amino acid fusogenic peptide of TAT (YGRKKRRQRRR
(SEQ ID NO:10)) is a nucleolar localization signal (see Boulikas, 1998b).

Another protein of HIV, the glycoprotein gp41, contains fusogenic peptides.
Linear peptides derived from the membrane-proximal region of the gp41 ectodomain
have potential applications as anti-HIV agents and inhibit infectivity by
adopting a helical conformation (Judice et al., 1997). The 23-amino acid
N-terminal peptide of HIV-1 gp41 has the capacity to destabilize negatively
charged large unilamellar vesicles. In the absence of cations, the main
structure was a pore-forming alpha-helix, whereas in the presence of Ca2+ the
conformation switched to a fusogenic, predominantly extended beta-type
structure. The fusion activity of HIV(ala) (bearing the R22→A substitution) was
reduced by 70%, whereas fusogenicity was completely abolished when a second
substitution (V2→E) was included, arguing that it is not an alpha-helical but an
extended structure adopted by the HIV-1 fusion peptide that actively
destabilizes cholesterol-containing, electrically neutral membranes (Pereira et
al., 1997).

The prion protein (PrP) is a glycoprotein of unknown function normally found
at the surface of neurons and of glial cells. It is involved in diseases such as
bovine spongiform encephalopathy and Creutzfeldt-Jakob disease in humans, where
PrP is converted into an altered form (termed PrPSc). According to computer
modeling calculations, the 120 to 133 and 118 to 135 domains of PrP are tilted
lipid-associating peptides that insert in an oblique way into a lipid bilayer
and are able to interact with liposomes to induce leakage of encapsulated
calcein (Pillot et al., 1997b).

The C-terminal fragments of the Alzheimer amyloid peptide (amino acids 29-40
and 29-42) have properties related to those of the fusion peptides of viral
proteins, inducing fusion of liposomes in vitro. These properties could mediate
a direct interaction of the amyloid peptide with cell membranes and account for
part of the cytotoxicity of the amyloid peptide. In view of the epidemiologic
and biochemical linkages between the pathology of Alzheimer's disease and
apolipoprotein E (apoE) polymorphism, the potential interaction between the
three common apoE isoforms and the C-terminal fragments of the amyloid peptide
was examined. Only apoE2 and apoE3, not apoE4, were potent inhibitors of the
fusogenic and aggregational properties of the amyloid peptide. The protective
effect of apoE against the formation of amyloid aggregates was thought to be
mediated by the formation of stable apoE/amyloid peptide complexes (Pillot et
al., 1997a; Lins et al., 1999).

The fusogenic properties of an amphipathic net-negative peptide (WAE 11),
consisting of 11 amino acid residues, were strongly promoted when the peptide
was anchored to a liposomal membrane. The fusion activity of the peptide
appeared to be independent of pH and membrane merging, and the target membranes
required a positive charge that was provided by incorporating lysine-coupled
phosphatidylethanolamine (PE-K). Whereas the coupled peptide could cause vesicle
aggregation via nonspecific electrostatic interaction with PE-K, the free
peptide failed to induce aggregation of PE-K vesicles (Pecheur et al., 1997).

A number of studies suggest that stabilization of an alpha-helical secondary
structure of the peptide after insertion in lipid bilayers in membranes of cells
or liposomes is responsible for the membrane fusion properties of peptides. Zn2+
enhances the fusogenic activity of peptides because it stabilizes the
alpha-helical structure. For example, the HEXXH (SEQ ID NO:11) domain located in
the C-terminal functional domain of histatin-5, a salivary antimicrobial
peptide, is a recognized zinc-binding motif and is in a helicoidal conformation
(Martin et al., 1999; Melino et al., 1999; Curtain et al., 1999).

Fusion peptides have been formulated with DNA plasmids to create
peptide-based gene delivery systems. A combination of the YKAKnWK (SEQ ID NO:12)
peptide, used to condense plasmids into 40 to 200 nm nanoparticles, with the
GLFEALLELLESLWELLLEA (SEQ ID NO:13) amphipathic peptide, a pH-sensitive lytic
agent designed to facilitate release of the plasmid from endosomes, enhanced
expression in systems containing the beta-galactosidase reporter gene (Duguid et
al., 1998). See Table 2, below.
Table 2. Fusogenic peptides

Fusogenic peptide | Source Protein | Properties | Reference
GLFEAIAGFIENGWEGMIDGGGYC (SEQ ID NO:9) | Influenza virus hemagglutinin HA-2 | Endowed with membrane fusion properties | Bongartz et al., 1994
YGRKKRRQRRR (SEQ ID NO:5) | TAT of HIV | Endowed with membrane fusion properties | Green and Loewenstein, 1988
23-residue fusogenic N-terminal peptide | HIV-1 transmembrane glycoprotein gp41 | Was able to insert as an alpha-helix into neutral phospholipid bilayers | Curtain et al., 1999
70-residue peptide (SV-117) | Fusion peptide and N-terminal heptad repeat of Sendai virus | Induced lipid mixing of egg phosphatidylcholine-phosphatidylglycerol (PC/PG) large unilamellar vesicles (LUVs) | Ghosh and Shai, 1999
23 hydrophobic amino acids in the amino-terminal region | S protein of hepatitis B virus (HBV) | A high degree of similarity with known fusogenic peptides from other viruses | Rodriguez-Crespo et al., 1994
MSGTFGGILAGLIGLL (SEQ ID NO:6) | N-terminal region of the S protein of duck hepatitis B virus (DHBV) | Was inserted into the hydrophobic core of the lipid bilayer and induced leakage of internal aqueous contents from both neutral and negatively charged liposomes | Rodriguez-Crespo et al., 1999
MSPSSLLGLLAGLQW (SEQ ID NO:14) | S protein of woodchuck hepatitis B virus (WHV) | Was inserted into the hydrophobic core of the lipid bilayer and induced leakage of internal aqueous contents from both neutral and negatively charged liposomes | Rodriguez-Crespo et al., 1999
N-terminus of Nef | Nef protein of human immunodeficiency virus type 1 (HIV-1) | Membrane-perturbing and fusogenic activities in artificial membranes; causes cell killing in E. coli and yeast | Macreadie et al., 1997
Amino-terminal sequence of F1 polypeptide | F1 polypeptide of measles virus (MV) | Can be used as a carrier system for CTL epitopes | Partidos et al., 1996
19-27 amino acid segment | Glycoprotein gp51 of bovine leukemia virus | Adopts an amphiphilic structure and plays a key role in the fusion events induced by bovine leukemia virus | Voneche et al., 1992
120 to 133 and 118 to 135 domains | Prion protein | Tilted lipid-associating peptides; interact with liposomes to induce leakage of encapsulated calcein | Pillot et al., 1997b
29-42-residue fragment | Alzheimer's beta-amyloid peptide | Endowed with capacities resembling those of the tilted fragment of viral fusion proteins | Lins et al., 1999
Non-aggregated amyloid beta-peptide (1-40) | Alzheimer's beta-amyloid peptide | Induces apoptotic neuronal cell death | Pillot et al., 1999
LCAT 56-68 helical segment | Lecithin cholesterol acyltransferase (LCAT) | Forms stable beta-sheets in lipids | Peelman et al., 1999; Decout et al., 1999
Peptide sequence B18 | Membrane-associated sea urchin sperm protein bindin | Triggers fusion between lipid vesicles; a histidine-rich motif for binding zinc is required for the fusogenic function | Ulrich et al., 1999
53-70 (C-terminal helix) | Apolipoprotein (apo) AII | Induces fusion of unilamellar lipid vesicles and displaces apo AI from HDL and r-HDL | Lambert et al., 1998
Residues 90-111 | PH-30 alpha (a protein functioning in sperm-egg fusion) | Membrane-fusogenic activity to acidic phospholipid bilayers | Niidome et al., 1997
Casein signal peptides | Alpha s2- and beta-casein | Interact with dimyristoylphosphatidylglycerol and -choline liposomes; show both lytic and fusogenic activities | Creuzenet et al., 1997
Pardaxin | Amphipathic polypeptide, purified from the gland secretion of the Red Sea Moses sole flatfish Pardachirus marmoratus | Forms voltage-gated, cation-selective pores; mediated the aggregation of liposomes composed of phosphatidylserine but not of phosphatidylcholine | Lelkes and Lazarovici, 1988
Histatin-5 | Salivary antimicrobial peptide | Aggregates and fuses negatively charged small unilamellar vesicles in the presence of Zn2+ | Melino et al., 1999
Gramicidin (linear hydrophobic polypeptide) | Antibiotic | Induces aggregation and fusion of vesicles | Massari and Colonna, 1986; Tournois et al., 1990
Amphipathic negatively charged peptide consisting of 11 residues (WAE) | Synthetic | Forms an alpha-helix inserted and anchored into the membrane (favored at 37°C) oriented almost parallel to the lipid acyl chains; promotes fusion of large unilamellar liposomes (LUV) | Martin et al., 1999
A polymer of polylysine (average 190) partially substituted with histidyl residues | Synthetic | Histidyl residues become cationic upon protonation of the imidazole groups at pH below 6.0; disrupt endosomal membranes | Midoux and Monsigny, 1999
GLFEALLELLESLWELLLEA (SEQ ID NO:4) | Synthetic | Amphipathic peptide; a pH-sensitive lytic agent to facilitate release of the plasmid from endosomes | Duguid et al., 1998
(LKKL)4 (SEQ ID NO:15) | Synthetic | Amphiphilic fusogenic peptide, able to interact with four molecules of DMPC | Gupta and Kothekar, 1997
Ac-(Leu-Ala-Arg-Leu)3-NHCH3 (SEQ ID NO:16) | Synthetic; basic amphipathic peptides | Caused a leakage of contents from small unilamellar vesicles composed of egg yolk phosphatidylcholine and egg yolk phosphatidic acid (3:1) | Suenaga et al., 1989; Lee et al., 1992
Amphiphilic anionic peptides E5 and E5L | Synthetic | Can mimic the fusogenic activity of influenza hemagglutinin (HA) | Murata et al., 1991
30-amino acid peptide with the major repeat unit Glu-Ala-Leu-Ala (GALA)7 (SEQ ID NO:17) | Synthetic; designed to mimic the behavior of the fusogenic sequences of viral fusion proteins | Becomes an amphipathic alpha-helix as the pH is lowered to 5.0; fusion of phosphatidylcholine small unilamellar vesicles induced by GALA requires a peptide length greater than 16 amino acids | Parente et al., 1988
Poly Glu-Aib-Leu-Aib (SEQ ID NO:18); Aib represents 2-aminoisobutyric acid | Synthetic | Amphiphilic structure upon the formation of alpha-helix; caused fusion of EYPC liposomes and dipalmitoylphosphatidylcholine liposomes more strongly with decreasing pH | Kono et al., 1993


Fusogenic lipids 

DOPE is a fusogenic lipid; elastase cleavage of
N-methoxy-succinyl-Ala-Ala-Pro-Val-DOPE (SEQ ID NO:19) converted this derivative
to DOPE (overall positive charge) to deliver an encapsulated fluorescent probe,
calcein, into the cell cytoplasm (Pak et al., 1999). An oligodeoxynucleotide
sequence of 30 bases complementary to a region of beta-endorphin mRNA elicited a
concentration-dependent inhibition of beta-endorphin production in cell culture
after it was encapsulated within small unilamellar vesicles (50 nm) containing
dipalmitoyl-DL-alpha-phosphatidyl-L-serine endowed with fusogenic properties
(Fresta et al., 1998).

Nuclear localization signals (NLS) 

In a further embodiment, the liposome-encapsulated plasmid or oligonucleotide
DNA described herein further comprises an effective amount of nuclear
localization signal (NLS) peptides. Trafficking of nuclear proteins from the
site of their synthesis in the cytoplasm to their sites of function in the
nucleus through pore complexes is mediated by NLSs on the proteins to be
imported into nuclei (Tables 3-10, below). Protein translocation from the
cytoplasm to the nucleoplasm involves: (i) the formation of a complex of
karyopherin α with NLS-protein; (ii) subsequent binding of karyopherin β; (iii)
binding of the complex to FXFG peptide repeats on nucleoporins; (iv) docking of
Ran-GDP to nucleoporin and to the karyopherin heterodimer by p10; (v) a number
of association-dissociation reactions on nucleoporins that dock the import
substrate toward the nucleoplasmic side, with a concomitant GDP-GTP exchange
reaction transforming Ran-GDP into Ran-GTP and catalyzed by karyopherin α; and
(vi) dissociation from karyopherin β and release of the karyopherin
α/NLS-protein by Ran-GTP to the nucleoplasm.

Karyophilic and acidic clusters were found in most non-membrane
serine/threonine protein kinases whose primary structure has been examined
(Table 6). These karyophilic clusters might mediate the anchoring of the kinase
molecules to transporter proteins for their regulated nuclear import and might
constitute the nuclear localization signals. In contrast to protein
transcription factors, which are exclusively nuclear and possess strong
karyophilic peptides composed of at least four arginines (R) and lysines (K)
within a hexapeptide flanked by proline and glycine helix-breakers, protein
kinases often contain one histidine and three K+R residues (Boulikas, 1996).
This was proposed to specify a weak NLS structure resulting in the nuclear
import of a fraction of the total cytoplasmic kinase molecules, as well as in
their weak retention in the different ionic strength nuclear environment.
Putative NLS peptides in protein kinases may also contain hydrophobic or bulky
aromatic amino acids proposed to further diminish their capacity to act as
strong NLSs.

Most mammalian proteins that participate in DNA repair pathways seem to
possess strong karyophilic clusters containing at least four R+K over a stretch
of six amino acids (Table 7).
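
The criterion of at least four R+K residues over a stretch of six amino acids
lends itself to a simple sequence scan. The following is a minimal sketch,
assuming one-letter amino acid codes; the function name
karyophilic_hexapeptides and its parameters are hypothetical and chosen only for
illustration. It applies the R+K count alone and does not model flanking helix
breakers, bulky residues, or any of the other factors that influence NLS
strength.

```python
def karyophilic_hexapeptides(sequence, window=6, min_basic=4):
    """List (start_index, segment) for each window rich in K and R.

    A window qualifies when it holds at least min_basic arginine (R) or
    lysine (K) residues. Start indices are 0-based. This is an illustrative
    scan of the R+K count only, not a validated NLS predictor.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        basic = sum(1 for residue in segment if residue in "KR")
        if basic >= min_basic:
            hits.append((start, segment))
    return hits


if __name__ == "__main__":
    # TAT peptide and the synthetic amphipathic peptide listed in Table 2.
    for name, peptide in (("TAT", "YGRKKRRQRRR"),
                          ("amphipathic", "GLFEALLELLESLWELLLEA")):
        hits = karyophilic_hexapeptides(peptide)
        print(name, len(hits), hits[:1])
```

Every six-residue window of the TAT peptide satisfies the criterion, whereas the
anionic amphipathic peptide, which contains no R or K, yields no hits.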

Rules to predict nuclear localization of an unknown protein 

Several simple rules have been proposed for the prediction of the nuclear
localization of a protein of an unknown function from its amino acid sequence:

(i) An NLS is defined as four arginines (R) plus lysines (K) within a
hexapeptide; the presence of one or more histidines (H) in the tetrad of the
karyophilic hexapeptide, often found in protein kinases that have a cytoplasmic and a
nuclear function, may specify a weak NLS whose function might be regulated by
phosphorylation or may specify proteins that function in both the cytoplasm and the
nucleus (Boulikas, 1996);

(ii) The K/R clusters are flanked by the α-helix breakers G and P thus placing
the NLS at a helix-turn-helix or end of an α-helix. Negatively-charged amino acids
(D, E) are often found at the flank of the NLS and on some occasions may interrupt
the positively-charged NLS cluster;

(iii) Bulky amino acids (W, F, Y) are not present within the NLS 
hexapeptide; 

(iv) NLS signals may not be flanked by long stretches of hydrophobic amino
acids (e.g. five); a mixture of charged and hydrophobic amino acids serves as a
mitochondrial targeting signal;

(v) The higher the number of NLSs, the more readily a molecule is imported 
to the nucleus (Dworetzky et al., 1988). Even small proteins, for example histones 
(10-22 kDa), need to be actively imported to increase their import rates compared 
with the slow rate of diffusion of small molecules through pores; 

(vi) Signal peptides are stronger determinants than NLSs for protein
trafficking. Signal peptides direct proteins to the lumen of the endoplasmic
reticulum for their secretion or insertion into cellular membranes (presence of
transmembrane domains) (Boulikas, 1994);

(vii) Signals for the mitochondrial import of proteins (a mixture of
hydrophobic and karyophilic amino acids) may antagonize nuclear import signals
and proteins possessing both types of signals may be translocated to both
mitochondria and nuclei;

(viii) Strong association of a protein with large cytoplasmic structures
(membrane proteins, intermediate filaments) makes such proteins unavailable for
import even though they possess NLS-like peptides (Boulikas, 1994);

(ix) Transcription factors and other nuclear proteins possess widely differing
numbers of putative NLS stretches. Of the sixteen possible forms of putative NLS
structures the most abundant types are θθxθθ, θθθxθ, θθθθ, and θθxθxθ, where
θ is R or K, together accounting for about 70% of all karyophilic clusters on
transcription factors (Boulikas, 1994);

(x) A small number of nuclear proteins seem to be devoid of a typical
karyophilic NLS. Either non-karyophilic peptides function for their nuclear import,
as such molecules possess bipartite NLSs, or these NLS-less proteins depend
absolutely for import on their strong complexation in the cytoplasm with a nuclear
protein partner able to be imported (Boulikas, 1994). This mechanism may ensure a
certain stoichiometric ratio of the two molecules in the nucleus, and might be of
physiological significance; and

(xi) A number of proteins may be imported via other mechanisms not 
dependent on classical NLS. 
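
Rules (i) to (iii) above can be sketched as a sliding-window scan over a protein sequence. The following Python fragment is only an illustration under stated assumptions: the function name scan_nls, the Candidate record, and the two-residue flank used to look for helix breakers are hypothetical choices, not part of the disclosure, and rules (iv) to (xi) are not modeled.

```python
from dataclasses import dataclass

BASIC = set("KR")           # karyophilic residues of rule (i)
BULKY = set("WFY")          # residues excluded by rule (iii)
HELIX_BREAKERS = set("GP")  # flanking residues of rule (ii)


@dataclass
class Candidate:
    start: int        # 0-based position of the hexapeptide
    hexapeptide: str
    strength: str     # "strong" or "weak"
    flanked: bool     # True if G or P lies within 2 residues of the window


def scan_nls(seq, window=6, min_basic=4, flank=2):
    """Return putative NLS hexapeptides in a one-letter amino acid sequence.

    Assumption: a window is "strong" with at least four K+R, and "weak" when
    histidine is needed to reach four basic positions (rule (i)).
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - window + 1):
        hexa = seq[i:i + window]
        if any(aa in BULKY for aa in hexa):
            continue  # rule (iii): no bulky residues inside the hexapeptide
        n_basic = sum(aa in BASIC for aa in hexa)
        n_his = hexa.count("H")
        if n_basic >= min_basic:
            strength = "strong"
        elif n_his >= 1 and n_basic + n_his >= min_basic:
            strength = "weak"
        else:
            continue
        # rule (ii): look for helix breakers immediately outside the window
        neighbours = seq[max(0, i - flank):i] + seq[i + window:i + window + flank]
        flanked = any(aa in HELIX_BREAKERS for aa in neighbours)
        hits.append(Candidate(i, hexa, strength, flanked))
    return hits


if __name__ == "__main__":
    # PKKKRKV is the well-known SV40 large T antigen NLS, used here as input.
    for hit in scan_nls("PKKKRKV"):
        print(hit)
```

Overlapping windows are reported separately; a practical tool would probably merge them into one cluster before counting NLSs per molecule as in rule (v).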

A number of processes have been found to be regulated by nuclear import
including nuclear translocation of the transcription factors NF-κB, rNFIL-6, ISGF3,
SRF, c-Fos, GR as well as human cyclins A and B1, casein kinase II, cAMP-dependent
protein kinase II, protein kinase C, ERK1 and ERK2. Failure of cells to
import specific proteins into nuclei can lead to carcinogenesis. For example,
BRCA1 is mainly localized in the cytoplasm in breast and ovarian cancer cells,
whereas in normal cells the protein is nuclear. mRNA is exported through the same
route as a complex with nuclear proteins possessing nuclear export signals (NES).
The majority of proteins with NES are RNA-binding proteins that bind to and escort
RNAs to the cytoplasm. However, other proteins with NES function in the export of
proteins; CRM1, which binds to the NES sequence on other proteins and interacts with
the nuclear pore complex, is an essential mediator of the NES-dependent nuclear
export of proteins in eukaryotic cells. Nuclear localization and export signals (NLS
and NES) are found on a number of important molecules, including p53, v-Rel, the
transcription factor NF-ATc, the c-Abl nonreceptor tyrosine kinase, and the fragile X
syndrome mental retardation gene product. The deregulation of their normal
import/export trafficking has important implications for human disease. Both
nuclear import and export processes can be manipulated by conjugation of proteins
with NLS or NES peptides. During gene therapy, the foreign DNA needs to enter
nuclei for its transcription. A pathway is proposed involving the complexation of
plasmids and oligonucleotides with nascent nuclear proteins possessing NLSs as a
prerequisite for their nuclear import. Covalent linkage of NLS peptides to
oligonucleotides and plasmids or formation of complexes of plasmids with proteins
possessing multiple NLS peptides was proposed (Boulikas, 1998b) to increase their
import rates and the efficiency of gene expression. Cancer cells were predicted to
import foreign DNA into nuclei more efficiently than terminally
differentiated cells because of their increased rates of proliferation and protein
import.

Antineoplastic drugs

In a further embodiment, the liposome-encapsulated plasmid or
oligonucleotide DNA described herein is used for reducing tumor
size or restricting tumor growth in combination with encapsulated or free
antineoplastic agents. Antineoplastic agents preferably are: (i) alkylating agents
having the bis-(2-chloroethyl)-amine group such as chlormethine, chlorambucil,
melphalan, uramustine, mannomustine, estramustine phosphate,
mechlorethaminoxide, cyclophosphamide, ifosfamide, or trifosfamide; (ii) alkylating
agents having a substituted aziridine group, for example tretamine, thiotepa,
triaziquone, or mitomycine; (iii) alkylating agents of the methanesulfonic ester type
such as busulfan; (iv) alkylating N-alkyl-N-nitrosourea derivatives, for example
carmustine, lomustine, semustine, or streptozotocine; (v) alkylating agents of the
mitobronitole, dacarbazine, or procarbazine type; (vi) complexing agents such as
cis-platin; (vii) antimetabolites of the folic acid type, for example methotrexate; (viii)
purine derivatives such as mercaptopurine, thioguanine, azathioprine, tiamiprine,
vidarabine, or puromycine and purine nucleoside phosphorylase inhibitors; (ix)
pyrimidine derivatives, for example fluorouracil, floxuridine, tegafur, cytarabine,
idoxuridine, flucytosine; (x) antibiotics such as dactinomycin, daunorubicin,
doxorubicin, mithramycin, bleomycin or etoposide; (xi) vinca alkaloids; (xii)
inhibitors of proteins overexpressed in cancer cells such as telomerase inhibitors,
glutathione inhibitors, proteasome inhibitors; (xiii) modulators or inhibitors of signal
transduction pathways such as phosphatase inhibitors, protein kinase C inhibitors,
casein kinase inhibitors, insulin-like growth factor-1 receptor inhibitor, ras
inhibitors, ras-GAP inhibitor, protein tyrosine phosphatase inhibitors; (xiv) tumor
angiogenesis inhibitors such as angiostatin, oncostatin, endostatin, thalidomide; (xv)
modulators of the immune response and cytokines such as interferons, interleukins,
TNF-alpha; (xvi) modulators of the extracellular matrix such as matrix
metalloproteinase inhibitors, stromelysin inhibitors, plasminogen activator inhibitor;
(xvii) hormone modulators for hormone-dependent cancers (breast cancer, prostate
cancer) such as antiandrogen, estrogens; (xviii) apoptosis regulators; (xix) bFGF
inhibitor; (xx) multiple drug resistance gene inhibitor; (xxi) monoclonal antibodies
or antibody fragments against antigens overexpressed in cancer cells (anti-Her2/neu
for breast cancer); (xxii) anticancer genes whose expression will cause apoptosis,
arrest the cell cycle, induce an immune response against cancer cells, inhibit tumor
angiogenesis, i.e. formation of blood vessels, tumor suppressor genes (p53, RB,
BRCA1, E1A, bcl-2, MDR-1, p21, p16, bax, bcl-xs, E2F, IGF-I, VEGF, angiostatin,
oncostatin, endostatin, GM-CSF, IL-12, IL-2, IL-4, IL-7, IFN-γ, and TNF-α); and
(xxiii) antisense oligonucleotides (antisense c-fos, c-myc, K-ras). Optionally these
drugs are administered in combination with chlormethamine, prednisolone,
prednisone, or procarbazine or combined with radiation therapy. Future new
anticancer drugs added to the arsenal are expected to be ribozymes, triplex-forming
oligonucleotides, gene inactivating oligonucleotides, a number of new genes directed
against genes that control the cell proliferation or signaling pathways, and
compounds that block signal transduction.

Anti-cancer drugs include: acivicin, aclarubicin, acodazole hydrochloride,
acronine, adozelesin, adriamycin, aldesleukin, altretamine, ambomycin, ametantrone
acetate, aminoglutethimide, amsacrine, anastrozole, anthramycin, asparaginase,
asperlin, azacitidine, azetepa, azotomycin, batimastat, benzodepa, bicalutamide,
bisantrene hydrochloride, bisnafide dimesylate, bizelesin, bleomycin sulfate,
brequinar sodium, bropirimine, busulfan, cactinomycin, calusterone, caracemide,
carbetimer, carboplatin, carmustine, carubicin hydrochloride, carzelesin, cedefingol,
chlorambucil, cirolemycin, cisplatin, cladribine, crisnatol mesylate,
cyclophosphamide, cytarabine, dacarbazine, dactinomycin, daunorubicin
hydrochloride, decitabine, dexormaplatin, dezaguanine, dezaguanine mesylate,
diaziquone, docetaxel, doxorubicin, doxorubicin hydrochloride, droloxifene,
droloxifene citrate, dromostanolone propionate, duazomycin, edatrexate, eflornithine
hydrochloride, elsamitrucin, enloplatin, enpromate, epipropidine, epirubicin
hydrochloride, erbulozole, esorubicin hydrochloride, estramustine, estramustine
phosphate sodium, etanidazole, etoposide, etoposide phosphate, etoprine, fadrozole
hydrochloride, fazarabine, fenretinide, floxuridine, fludarabine phosphate,
fluorouracil, flurocitabine, fosquidone, fostriecin sodium, gemcitabine, gemcitabine
hydrochloride, hydroxyurea, idarubicin hydrochloride, ifosfamide, ilmofosine,
interferon alfa-2a, interferon alfa-2b, interferon alfa-n1, interferon alfa-n3,
interferon beta-1a, interferon gamma-1b, iproplatin, irinotecan hydrochloride,
lanreotide acetate, letrozole,
leuprolide acetate, liarozole hydrochloride, lometrexol sodium, lomustine,
losoxantrone hydrochloride, masoprocol, maytansine, mechlorethamine
hydrochloride, megestrol acetate, melengestrol acetate, melphalan, menogaril,
mercaptopurine, methotrexate, methotrexate sodium, metoprine, meturedepa,
mitindomide, mitocarcin, mitocromin, mitogillin, mitomalcin, mitomycin, mitosper,
mitotane, mitoxantrone hydrochloride, mycophenolic acid, nocodazole,
nogalamycin, ormaplatin, oxisuran, paclitaxel, pegaspargase, peliomycin,
pentamustine, peplomycin sulfate, perfosfamide, pipobroman, piposulfan,
piroxantrone hydrochloride, plicamycin, plomestane, porfimer sodium,
porfiromycin, prednimustine, prednisone, procarbazine hydrochloride, puromycin,
puromycin hydrochloride, pyrazofurin, riboprine, rogletimide, safingol, safingol
hydrochloride, semustine, simtrazene, sparfosate sodium, sparsomycin,
spirogermanium hydrochloride, spiromustine, spiroplatin, streptonigrin, streptozocin,
sulofenur, talisomycin, taxol, tecogalan sodium, tegafur, teloxantrone hydrochloride,
temoporfin, teniposide, teroxirone, testolactone, thiamiprine, thioguanine, thiotepa,
tiazofurin, tirapazamine, topotecan hydrochloride, toremifene citrate, trestolone
acetate, triciribine phosphate, trimetrexate, trimetrexate glucuronate, triptorelin,
tubulozole hydrochloride, uracil mustard, uredepa, vapreotide, verteporfin,
vinblastine sulfate, vincristine sulfate, vindesine, vindesine sulfate, vinepidine
sulfate, vinglycinate sulfate, vinleurosine sulfate, vinorelbine tartrate, vinrosidine
sulfate, vinzolidine sulfate, vorozole, zeniplatin, zinostatin, zorubicin hydrochloride.
Other anti-cancer drugs include: 20-epi-1,25 dihydroxyvitamin D3,
5-ethynyluracil, abiraterone, aclarubicin, acylfulvene, adecypenol, adozelesin,
aldesleukin, ALL-TK antagonists, altretamine, ambamustine, amidox, amifostine,
aminolevulinic acid, amrubicin, amsacrine, anagrelide, anastrozole, andrographolide,
angiogenesis inhibitors, antagonist D, antagonist G, antarelix, anti-dorsalizing
morphogenetic protein-1, antiandrogen, antiestrogen, antineoplaston, antisense
oligonucleotides, aphidicolin glycinate, apoptosis gene modulators, apoptosis
regulators, apurinic acid, ara-CDP-DL-PTBA, arginine deaminase, asulacrine,
atamestane, atrimustine, axinastatin 1, axinastatin 2, axinastatin 3, azasetron,
azatoxin, azatyrosine, baccatin III derivatives, balanol, batimastat, BCR/ABL
antagonists, benzochlorins, benzoylstaurosporine, beta lactam derivatives,
beta-alethine, betaclamycin B, betulinic acid, bFGF inhibitor, bicalutamide, bisantrene,
bisaziridinylspermine, bisnafide, bistratene A, bizelesin, breflate, bropirimine,
budotitane, buthionine sulfoximine, calcipotriol, calphostin C, camptothecin
derivatives, canarypox IL-2, capecitabine, carboxamide-amino-triazole,
carboxyamidotriazole, CaRest M3, CARN 700, cartilage derived inhibitor,
carzelesin, casein kinase inhibitors (ICOS), castanospermine, cecropin B, cetrorelix,
chlorins, chloroquinoxaline sulfonamide, cicaprost, cis-porphyrin, cladribine,
clomifene analogues, clotrimazole, collismycin A, collismycin B, combretastatin A4,
combretastatin analogue, conagenin, crambescidin 816, crisnatol, cryptophycin 8,
cryptophycin A derivatives, curacin A, cyclopentanthraquinones, cycloplatam,
cypemycin, cytarabine ocfosfate, cytolytic factor, cytostatin, dacliximab, decitabine,
dehydrodidemnin B, deslorelin, dexifosfamide, dexrazoxane, dexverapamil,
diaziquone, didemnin B, didox, diethylnorspermine, dihydro-5-azacytidine,
dihydrotaxol, 9-dioxamycin, diphenyl spiromustine, docosanol, dolasetron,
doxifluridine, droloxifene, dronabinol, duocarmycin SA, ebselen, ecomustine,
edelfosine, edrecolomab, eflornithine, elemene, emitefur, epirubicin, epristeride,
estramustine analogue, estrogen agonists, estrogen antagonists, etanidazole,
etoposide phosphate, exemestane, fadrozole, fazarabine, fenretinide, filgrastim,
finasteride, flavopiridol, flezelastine, fluasterone, fludarabine, fluorodaunorunicin
hydrochloride, forfenimex, formestane, fostriecin, fotemustine, gadolinium gallium
nitrate texaphyrin, galocitabine, ganirelix, gelatinase inhibitors, gemcitabine,
glutathione inhibitors, hepsulfam, heregulin, hexamethylene bisacetamide, hypericin,
ibandronic acid, idarubicin, idoxifene, idramantone, ilmofosine, ilomastat,
imidazoacridones, imiquimod, immunostimulant peptides, insulin-like growth factor-1
receptor inhibitor, interferon agonists, interferons, interleukins, iobenguane,
iododoxorubicin, ipomeanol, 4-, irinotecan, iroplact, irsogladine, isobengazole,
isohomohalicondrin B, itasetron, jasplakinolide, kahalalide F, lamellarin-N
triacetate, lanreotide, leinamycin, lenograstim, lentinan sulfate, leptolstatin,
letrozole, leukemia inhibiting factor, leukocyte alpha interferon,
leuprolide+estrogen+progesterone, leuprorelin, levamisole, liarozole, linear
polyamine analogue, lipophilic disaccharide peptide, lipophilic platinum compounds,
lissoclinamide 7, lobaplatin, lombricine, lometrexol, lonidamine, losoxantrone,
lovastatin, loxoribine, lurtotecan, lutetium texaphyrin, lysofylline, lytic peptides,
maitansine, mannostatin A, marimastat, masoprocol, maspin, matrilysin inhibitors,
matrix metalloproteinase inhibitors, menogaril, merbarone, meterelin, methioninase,
metoclopramide, MIF inhibitor, mifepristone, miltefosine, mirimostim, mismatched
double stranded RNA, mitoguazone, mitolactol, mitomycin analogues, mitonafide,
mitotoxin fibroblast growth factor-saporin, mitoxantrone, mofarotene,
molgramostim, monoclonal antibody, human chorionic gonadotrophin,
monophosphoryl lipid A+myobacterium cell wall sk, mopidamol, multiple drug
resistance gene inhibitor, multiple tumor suppressor 1-based therapy, mustard
anticancer agent, mycaperoxide B, mycobacterial cell wall extract, myriaporone,
N-acetyldinaline, N-substituted benzamides, nafarelin, nagrestip,
naloxone+pentazocine, napavin, naphterpin, nartograstim, nedaplatin, nemorubicin,
neridronic acid, neutral endopeptidase, nilutamide, nisamycin, nitric oxide
modulators, nitroxide antioxidant, nitrullyn, O6-benzylguanine, octreotide,
okicenone, oligonucleotides, onapristone, ondansetron, oracin, oral
cytokine inducer, ormaplatin, osaterone, oxaliplatin, oxaunomycin, paclitaxel
analogues, paclitaxel derivatives, palauamine, palmitoylrhizoxin, pamidronic acid,
panaxytriol, panomifene, parabactin, pazelliptine, pegaspargase, peldesine, pentosan
polysulfate sodium, pentostatin, pentrozole, perflubron, perfosfamide, perillyl
alcohol, phenazinomycin, phenylacetate, phosphatase inhibitors, picibanil,
pilocarpine hydrochloride, pirarubicin, piritrexim, placetin A, placetin B,
plasminogen activator inhibitor, platinum complex, platinum compounds,
platinum-triamine complex, porfimer sodium, porfiromycin, propyl bis-acridone,
prostaglandin J2, proteasome inhibitors, protein A-based immune modulator, protein
kinase C inhibitor, protein kinase C inhibitors, microalgal, protein tyrosine
phosphatase inhibitors, purine nucleoside phosphorylase inhibitors, purpurins,
pyrazoloacridine, pyridoxylated hemoglobin polyoxyethylene conjugate, raf
antagonists, raltitrexed, ramosetron, ras farnesyl protein transferase inhibitors, ras
inhibitors, ras-GAP inhibitor, retelliptine demethylated, rhenium Re 186 etidronate,
rhizoxin, ribozymes, RII retinamide, rogletimide, rohitukine, romurtide, roquinimex,
rubiginone B1, ruboxyl, safingol, saintopin, SarCNU, sarcophytol A, sargramostim,
Sdi 1 mimetics, semustine, senescence derived inhibitor 1, sense oligonucleotides,
signal transduction inhibitors, signal transduction modulators, single chain antigen
binding protein, sizofiran, sobuzoxane, sodium borocaptate, sodium phenylacetate,
solverol, somatomedin binding protein, sonermin, sparfosic acid, spicamycin D,
spiromustine, splenopentin, spongistatin 1, squalamine, stem cell inhibitor, stem-cell
division inhibitors, stipiamide, stromelysin inhibitors, sulfinosine, superactive
vasoactive intestinal peptide antagonist, suradista, suramin, swainsonine, synthetic
glycosaminoglycans, tallimustine, tamoxifen methiodide, tauromustine, tazarotene,
tecogalan sodium, tegafur, tellurapyrylium, telomerase inhibitors, temoporfin,
temozolomide, teniposide, tetrachlorodecaoxide, tetrazomine, thaliblastine,
thalidomide, thiocoraline, thrombopoietin, thrombopoietin mimetic, thymalfasin,
thymopoietin receptor agonist, thymotrinan, thyroid stimulating hormone, tin ethyl
etiopurpurin, tirapazamine, titanocene dichloride, topotecan, topsentin, toremifene,
totipotent stem cell factor, translation inhibitors, tretinoin, triacetyluridine,
triciribine, trimetrexate, triptorelin, tropisetron, turosteride, tyrosine kinase
inhibitors, tyrphostins, UBC inhibitors, ubenimex, urogenital sinus-derived growth
inhibitory factor, urokinase receptor antagonists, vapreotide, variolin B, velaresol,
veramine, verdins, verteporfin, vinorelbine, vinxaltine, vitaxin, vorozole, zanoterone,
zeniplatin, zilascorb, zinostatin stimalamer.



pH-sensitive peptide-DNA complexes 

In a further embodiment of the invention, the genes in plasmid DNA are
brought into interaction with fusogenic peptide/NLS conjugates. In a further
embodiment, the NLS moiety is a stretch of histidyl residues able to assume a net
positive charge at a pH of about 5 to 6 and to reduce or completely lose
this charge at a pH above 7. The electrostatic interaction of these positively-charged
peptides with the negatively-charged plasmid DNA molecules, established at pH 5-6,
is weakened at physiological pH (pH-sensitive peptide-DNA complexes).
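
The pH dependence described above can be illustrated with the Henderson-Hasselbalch relation for the histidine imidazole side chain. This is a rough sketch only: the pKa of 6.0 is an assumed textbook value (the effective pKa in a peptide bound to DNA or lipid may differ), the residues are treated as independent, and the function names are hypothetical.

```python
def protonated_fraction(pH, pKa=6.0):
    """Fraction of imidazole side chains carrying a positive charge.

    Henderson-Hasselbalch: fraction = 1 / (1 + 10**(pH - pKa)).
    pKa = 6.0 is an assumption, not a value given in the disclosure.
    """
    return 1.0 / (1.0 + 10 ** (pH - pKa))


def his_stretch_charge(n_his, pH, pKa=6.0):
    """Approximate mean net positive charge of a stretch of n_his histidines."""
    return n_his * protonated_fraction(pH, pKa)


if __name__ == "__main__":
    # Compare the complex-formation pH range with physiological pH.
    for pH in (5.0, 5.5, 6.0, 7.0, 7.4):
        print(f"pH {pH}: mean charge of His5 = {his_stretch_charge(5, pH):.2f}")
```

Under these assumptions a five-histidine stretch carries most of its possible charge at pH 5, about half at pH 6, and only a small fraction at pH 7.4, which is consistent with the weakening of the peptide-DNA interaction at physiological pH.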
The first step of the present invention involves complex formation between
the plasmid or oligonucleotide DNA with the histidyl/fusogenic peptide conjugate
and lipid components in 10-90% ethanol at pH 5.0 to 6.0. The conditions must be
such that the histidyl residues have a net positive charge and can establish electrostatic
interactions with plasmids, oligonucleotides or negatively-charged drugs. At the
same time, the presence of the positively-charged lipid molecules promotes
formation of micelles. In the second step, micelles are converted into liposomes by
dilution with water and mixing with pre-made liposomes or lipids at pH 5-6. This is
followed by dialysis against pH 7 and extrusion through membranes, entrapping and
encapsulating plasmids or oligonucleotides with a very high yield.

Whereas the composition of peptides and cationic lipids in the first step
provides the lipids of the internal bilayer, the type of liposomes or lipids added in
step 2 provides the external coating of the final liposome formulation (Figure 1).
Examples for the formulations of peptides include: HHHHHSPSL16 (SEQ ID
NO:623), and HHHHHSPS(LAI)5 (SEQ ID NO:624).
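
To make the shorthand concrete, the two example peptides can be written out in full. The sketch below assumes that the trailing numerals are repeat counts, i.e. sixteen leucines in the first peptide and five repeats of Leu-Ala-Ile in the second; that reading, and the helper names, are assumptions and should be checked against the sequence listing.

```python
HEAD = "HHHHHSPS"  # five histidines followed by the Ser-Pro-Ser linker


def build_peptide(head, repeat_unit, n_repeats):
    """Expand shorthand such as HEAD + (LAI)5 into a full one-letter sequence."""
    return head + repeat_unit * n_repeats


def describe(peptide, head=HEAD):
    """Summarise total length, histidine count and hydrophobic tail length."""
    tail = peptide[len(head):]
    return {
        "length": len(peptide),
        "histidines": peptide.count("H"),
        "hydrophobic_tail": len(tail),
    }


# Assumed expansions of the two example formulations.
peptide_l16 = build_peptide(HEAD, "L", 16)
peptide_lai5 = build_peptide(HEAD, "LAI", 5)

if __name__ == "__main__":
    for name, pep in (("L16", peptide_l16), ("(LAI)5", peptide_lai5)):
        print(name, pep, describe(pep))
```

With this reading both peptides carry five histidines and a hydrophobic chain of 15 or 16 residues, which falls within the range of about 10-20 amino acids and four or more histidines stated for the conjugates.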

These are added at a 1:0.5:0.5 molar ratio (negative charge on DNA: cationic
liposome: histidine peptide). The peptide inserts in an alpha-helical conformation
inside the lipid bilayer and not only carries out DNA condensation but also endows
the complex with membrane fusion properties to improve entrance across the cell
membrane. The type of hydrophobic amino acids (for example, content in aromatic
amino acids) in the peptide chain is very important, as is the length of the peptide
chain, in ensuring integrity and rigidity of the complexes. Coating the outer surface
of the complexes with polyethyleneglycol, hyaluronic acid and other polymers
conjugated to lipids gives the particles long circulation properties in body fluids and
the ability to target solid tumors and their metastases after intravenous injection, and
also the ability to cross the tumor cell membrane.
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Protease-sensitive linkages in peptides between the NLS and fusogenic 
moieties 

Conversion of micelles into liposomes 

An important aspect of the present invention is the conversion of micelles
formed between the DNA and the cationic lipids, in the presence of ethanol, into
liposomes. This is done by the direct addition of the micelle complex into an
aqueous solution of preformed liposomes having an average size of 80-160 nm,
or vice versa, leading to a solution with a final ethanol concentration below
10%. A formulation suitable for pharmaceutical use and for injection into humans
and animals will require that the liposomes are of neutral composition (such as
cholesterol, PE, PC) coated with PEG.

However, another important aspect is the research application of the present
invention, such as for transfection of cells in culture. The composition of the
aqueous solution of liposomes is any type of liposomes containing cationic lipids
and suitable therefore for transfection of cells in culture such as DDAB:DOPE 1:1.
These liposomes are pre-formed and downsized by sonication or extrusion through
membranes to a diameter of 80-160 nm. The ethanolic micelle preparations are then
added to the aqueous solution of liposomes with a concomitant dilution of the
ethanol solution to below 10%. This step will result in further condensation of DNA
or interaction of the negatively-charged phosphate groups on DNA with positively
charged groups on lipids. Care must be taken so that only part of the negative charges
on DNA are neutralized by lipids in the micelle. The remaining charge
neutralization of the DNA is to be provided by the cationic component of the
preformed liposomes in the second step.
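
The dilution requirement (final ethanol below 10%) is a simple mixing calculation. The sketch below assumes that the aqueous liposome solution contains no ethanol and that volumes are additive, which is only approximately true for ethanol-water mixtures; the function names and the example volumes are hypothetical.

```python
def final_ethanol_percent(micelle_ml, micelle_ethanol_percent, liposome_ml):
    """Ethanol % (v/v) after adding micelles to an ethanol-free liposome solution."""
    total_ml = micelle_ml + liposome_ml
    return micelle_ethanol_percent * micelle_ml / total_ml


def min_liposome_volume_ml(micelle_ml, micelle_ethanol_percent, target_percent=10.0):
    """Smallest aqueous volume that brings ethanol down to target_percent.

    Use a somewhat larger volume in practice, since the text requires the
    final concentration to be below (not equal to) 10%.
    """
    if micelle_ethanol_percent <= target_percent:
        return 0.0
    return micelle_ml * (micelle_ethanol_percent / target_percent - 1.0)


if __name__ == "__main__":
    # Hypothetical example: 1 ml of micelles prepared in 50% ethanol.
    needed = min_liposome_volume_ml(1.0, 50.0)
    print(f"at least {needed:.1f} ml of liposome solution")
    print(f"check: {final_ethanol_percent(1.0, 50.0, needed):.1f}% ethanol")
```

For micelles prepared anywhere in the 10-90% ethanol range, the same two functions give the minimum dilution factor needed before dialysis.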


Regulatory DNA and nuclear matrix-attached DNA 

In a further embodiment of the present invention, the genes in plasmid DNA 
are driven by regulatory DNA sequences isolated from nuclear matrix-attached DNA 
using shotgun selection approaches. 
The compact structural organization of chromatin and the proper spatial
orientation of individual chromosomes within a cell are partially provided by the
nuclear matrix. The nuclear matrix is composed of DNA, RNA and proteins and
serves as the site of DNA replication, gene transcription, DNA repair, and
chromosomal attachment in the nucleus. Diverse sets of DNA sequences have been
found associated with nuclear matrices and are referred to as matrix attachment
regions or MARs. The MARs serve many functions, acting as activators of gene
transcription, silencers of gene expression, insulators of transcriptional activity,
nuclear retention signals and origins of DNA replication. Current studies indicate
that different subsets of MARs are found in different tissue types and may assist in
regulating the specific functions of cells. The presence of this complex assortment
of structural and regulatory molecules in the matrix, as well as the in situ localization
of DNA replication and transcription complexes to the matrix, strongly suggests that
the nuclear matrix plays a fundamental, unique role in nuclear processes. The
structuring of genomes into domains has a functional significance. The inclusion of
specific MAR elements within gene transfer vectors could have utility in many
experimental and gene therapy applications. Many gene therapy applications require
specific expression of one or more genes in targeted cell types for prolonged time
periods. MARs within vectors could enhance transcription of the introduced
transgene, prolong the retention of that sequence within the nucleus or insulate
expression of that transgene from the expression of a cotransduced gene (reviewed
by Boulikas, 1995; Bode et al., 1996).

Various biochemical procedures have been used to identify regulatory
regions within genes. Traditionally, identification and selection of regulatory DNA
sequences depend on tedious procedures such as transcription factor footprinting in
vitro or in vivo, or subcloning of smaller fragments from larger genomic DNA
sequences upstream of reporter genes. These methods have been used primarily to
identify regions proximal to the 5' end of genes. However, in many instances,
regulatory regions are found at considerable distances from the proximal 5' end of
the gene, and confer cell type- or developmental stage-specificity. For example,
studies from the groups of Grosveld and Engel (Lakshmanan et al., 1999) have
shown that over 625 kb of genomic sequences surrounding the GATA-3 locus are
required for the correct developmental expression of the gene in transgenic mice.
Extensive DNA stretches at distances 5-20 kb upstream of the gene were found to be
responsible for the central nervous system-specificity of expression. The region
between 20 to 130 kb upstream of the gene harbored regulatory regions for
urogenital-specific expression of GATA-3, whereas sequences 90-180 kb
downstream of the gene conferred endocardial-specific expression.

The presently disclosed method has the potential of rapidly identifying
regulatory control regions. In cells, chromatin loops are formed and different
attachment regions are used in different cell types or stages of development to
modulate the expression of a gene. The presently disclosed method for isolating
regulatory regions based on their attachment to the nuclear matrix can identify
regulatory regions irrespective of their distance from the gene. Although the human
genome project is expected to be almost complete by the year 2000, information on
the location and nature of the vast majority of the estimated 500,000 regulatory
regions will not be available.

Example 1 

Plasmid DNA condenses with various agents, as well as various formulations
of cationic liposomes. The condensation affects the level of expression of the
reporter beta-galactosidase gene after transfection of K562 human erythroleukemia
cell cultures. Liposome compositions are shown in the Table below and in FIG. 2.
All lipids were from Avanti Polar Lipids (700 Industrial Park Drive, Alabaster, AL
35007). The optimal ratio of lipid to DNA was 7 nmoles total lipid/µg DNA. The
transfection reagent (10 µg DNA mixed with 70 nmoles total lipid) was transferred
to a small culture flask followed by the addition of 10 ml K562 cell culture (about 2
million cells total); mixing of cells with the transfection reagent was at 5-10 min
after mixing DNA with liposomes. Cells were assayed for beta-galactosidase
activity several times at 1-30 days post-transfection. The transfected cells were
maintained in cell culture as normal cell cultures.
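
The lipid dose follows directly from the stated ratio of 7 nmoles total lipid per µg DNA. The short sketch below shows the arithmetic; the function names are hypothetical, and the stock concentration passed to liposome_volume_ul is an illustrative parameter (total lipid, in µmoles/ml) rather than a value fixed by this Example.

```python
def lipid_nmol(dna_ug, ratio_nmol_per_ug=7.0):
    """Total lipid (nmoles) needed for a given DNA mass at the stated ratio."""
    return dna_ug * ratio_nmol_per_ug


def liposome_volume_ul(dna_ug, stock_umol_per_ml, ratio_nmol_per_ug=7.0):
    """Volume of liposome stock (µl) delivering the required lipid.

    1 µmole/ml equals 1 nmole/µl, so nmoles divided by µmoles/ml gives µl.
    """
    return lipid_nmol(dna_ug, ratio_nmol_per_ug) / stock_umol_per_ml


if __name__ == "__main__":
    # 10 µg DNA, as in the transfection reagent described above.
    print(lipid_nmol(10.0), "nmoles total lipid")
    # Hypothetical stock at 1 µmole/ml total lipid.
    print(liposome_volume_ul(10.0, 1.0), "µl of stock")
```

For 10 µg DNA the first function reproduces the 70 nmoles total lipid given in the text.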

Best results were obtained when the cells used for transfection were at low
number, not near confluence. In all experiments the transfection material was added
directly in the presence of serum and antibiotics without removal of the transfection
reagent or washings of the cells. This simplifies the transfection procedure and is
suitable for lymphoid and other types of cell cultures that do not attach to the dish, but
grow in suspension. All DNA condensing agents were purchased from Sigma. They
were suspended at 0.1 mg/ml in water. Plasmid pCMVβ was purchased from
Clontech and was purified using the Anaconda kit of Althea Technologies (San
Diego, CA). PolyK is polylysine, mw 9,400. PolyR is polyarginine. PolyHis is
polyhistidine.

To 100 µl plasmid solution (10 µg total plasmid DNA), 20 µl or 50 µl of
polyK, polyR, or polyH were added; the volume was adjusted to 250 µl with water,
followed by addition of about 70 µl liposomes (7 nmoles/µg DNA). After
incubation for 10 min to 1 h at 20°C the transfection mixture was brought in contact
with the cell culture. The best DNA condensing reagent was polyhistidine compared
with the popular polylysine. The best cationic lipid was DC-cholesterol (DC-Chol:
3β-[N-(N',N'-dimethylaminoethane)carbamoyl]cholesterol). SFV is Semliki Forest
virus expressing beta-galactosidase. The results are shown in FIG. 2.



Liposome  Molecular weight        Composition              Preparation

L2        DDAB (mw 631)           DDAB 4.2 µmoles/ml       15 mg DDAB
          DOPE (mw 744)           DOPE 4.2 µmoles/ml       + 0.88 ml DOPE (20 mg/ml)

L3        DOGS-NTA (mw 1015.4)    DOGS-NTA 1 µmole/ml      5 mg DOGS
                                  DOPE 1 µmole/ml          + 0.185 ml DOPE

L4        DC-Chol (mw 537)        DC-Chol 1 µmole/ml       0.106 ml DC-Chol (25 mg/ml)
          DOPE (mw 744)           DOPE 1 µmole/ml          + 0.185 ml DOPE (20 mg/ml)

L5        DOTAP (mw 698)          DOTAP 1.4 µmole/ml       0.5 ml DOTAP (10 mg/ml)
          DOPE (mw 744)           DOPE 1.3 µmole/ml        + 0.25 ml DOPE (20 mg/ml)

L6        DODAP (mw 648)          DODAP 1.54 µmoles/ml     0.5 ml DODAP (10 mg/ml) = 5 mg = 7.72 µmoles
                                  DOPE 1.3 µmole/ml        + 0.25 ml DOPE (20 mg/ml)
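
The preparation and composition columns of the Table can be cross-checked by converting masses to µmoles with the listed molecular weights. The sketch below is an illustrative check only: it reads the concentrations as µmoles/ml, and the final volume it derives is an inference, since the Table does not state the volume to which each preparation was made up. Function names are hypothetical.

```python
# Molecular weights (g/mol) as listed in the Table.
MW = {
    "DDAB": 631.0,
    "DOPE": 744.0,
    "DOGS-NTA": 1015.4,
    "DC-Chol": 537.0,
    "DOTAP": 698.0,
    "DODAP": 648.0,
}


def micromoles(mass_mg, lipid):
    """Convert a lipid mass in mg to µmoles using the tabulated mw."""
    return mass_mg / MW[lipid] * 1000.0


def implied_volume_ml(mass_mg, lipid, conc_umol_per_ml):
    """Final volume (ml) implied by a mass and a stated concentration."""
    return micromoles(mass_mg, lipid) / conc_umol_per_ml


if __name__ == "__main__":
    # L6: 0.5 ml of 10 mg/ml DODAP is 5 mg.
    print(f"DODAP: {micromoles(5.0, 'DODAP'):.2f} µmoles")
    print(f"implied final volume: {implied_volume_ml(5.0, 'DODAP', 1.54):.2f} ml")
    # L2: 15 mg DDAB against 0.88 ml of 20 mg/ml DOPE (17.6 mg).
    print(f"DDAB: {micromoles(15.0, 'DDAB'):.1f} µmoles, "
          f"DOPE: {micromoles(0.88 * 20.0, 'DOPE'):.1f} µmoles")
```

The DODAP entry reproduces the 7.72 µmoles stated in the Table, and the L2 masses give nearly equimolar DDAB and DOPE, in line with the equal concentrations listed for that liposome.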



Example 2

Targeting Genes to Tumors Using Gene Vehicles (Lipogenes). 

FIG. 3 shows tumor targeting in SCID (severe combined immunodeficient) mice.
The mice were implanted subcutaneously, at two sites, with human
MCF-7 breast cancer cells. The cells were allowed to develop into large, measurable
solid tumors at about 30 days post-inoculation. Mice were injected
intraperitoneally with 0.2 mg plasmid pCMVβ DNA (size of the plasmid is ~4 kb)
per animal carrying the bacterial beta-galactosidase reporter gene. Plasmid DNA
(200 μg, 2.0 mg/ml, 0.1 ml) was incubated for 5 min with 200 μl neutral liposomes of
the composition 40% cholesterol, 20% dioleoylphosphatidylethanolamine (DOPE),
12% palmitoyloleoylphosphatidylcholine (POPC), 10% hydrogenated soy
phosphatidylcholine (HSPC), 10% distearoylphosphatidylethanolamine (DSPE), 5%
sphingomyelin (SM), and 3% derivatized vesicle-forming lipid M-PEG-DSPE.

At this stage, weak complexation of plasmid DNA with neutral (zwitterionic)
liposomes takes place. This ensures homogeneous distribution of plasmid DNA to
liposomes at the subsequent step of addition of cationic liposomes. After
complexation of plasmid DNA with zwitterionic liposomes, 50 μl of cationic
liposomes (DC-Chol 1 μmole/ml:DOPE 1.4 μmole/ml) were added and incubated at
room temperature for 10 min. At this stage, a mixed liposome population is present
and, most likely, liposome-DNA complexes containing lipids from both the
zwitterionic and the cationic liposomes are formed. The material was injected (0.35
ml total volume) into the intraperitoneal cavity of the animal. At 5 days post-injection
the animal was sacrificed, the skin was removed and the carcass was incubated in
X-gal staining solution for about 30 min at 37°C. The animal was then incubated for
about 30 min in X-gal staining solution containing fixative (100 μl concentrated
glutaraldehyde added to 30 ml X-gal staining solution), after which the incubation in
staining solution was continued. Photos were taken in a time course during the
incubation period, revealing the preferred organs where beta-galactosidase
expression took place.
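
The quantities in this example are internally consistent, which can be verified with a few lines of arithmetic. The following sketch simply restates the numbers given above; the variable names are illustrative and nothing here goes beyond the stated values.

```python
# Neutral liposome composition (percent), as given in the text.
composition = {
    "cholesterol": 40, "DOPE": 20, "POPC": 12, "HSPC": 10,
    "DSPE": 10, "SM": 5, "M-PEG-DSPE": 3,
}
total_percent = sum(composition.values())

# Volumes mixed before injection, in microliters.
volumes_ul = {"plasmid DNA": 100, "neutral liposomes": 200, "cationic liposomes": 50}
total_ml = sum(volumes_ul.values()) / 1000.0

# Plasmid dose: 2.0 mg/ml x 100 ul; mg/ml times ul gives micrograms.
dna_ug = 2.0 * volumes_ul["plasmid DNA"]

print(total_percent, total_ml, dna_ug)
```

The lipid percentages sum to 100, the three volumes add up to the 0.35 ml injected, and the plasmid dose matches the stated 200 μg (0.2 mg) per animal.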

Because of the tumor vasculature targeting shown in FIG. 3E, the data imply
that transfer to the tumors of the genes of angiostatin, endostatin, or oncostatin
(whose gene products restrict vascular growth and inhibit blood supply to the tumor)
is expected to be a rational approach for cancer treatment. Also, a combination
therapy using anticancer lipogenes with drugs encapsulated into tumor-targeting
liposomes appears to be a rational cancer therapy.

It is to be understood that while the invention has been described in
conjunction with the above embodiments, the foregoing description and the
following examples are intended to illustrate and not limit the scope of the invention.
Other aspects, advantages and modifications within the scope of the invention will
be apparent to those skilled in the art to which the invention pertains.
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Table 3 Simple NLS 



Signal oligopeptide 


Protein and features 


PKKKRKV (SEQ ID NO:20)


Wild-type SV40 large T protein 

A point mutation converting lysine-128 (double underlined) to
threonine results in the retention of large T in the cytoplasm. Transfer
of this peptide to the N-terminus of β-galactosidase or pyruvate kinase
at the gene level and microinjection of plasmids into Vero cells
showed nuclear location of chimeric proteins.


PKKKRMV (SEQ ID NO:21) 


SV40 large T with a K→M change. Site-directed mutagenesis only
slightly impaired nuclear import of large T. 


PKKKRKVEDP (SEQ ID 
NO:22) 


Synthetic NLS peptide from SV40 large T antigen crosslinked to BSA
or IgG mediated their nuclear localization after microinjection in 
Xenopus oocytes. The PKKGSKKA from Xenopus H2B was 
ineffective and PKTKRKV was less effective. 


CGYGPKKKRKVGG (SEQ ID 
NO:23) 


Synthetic peptide from SV40 large T antigen conjugated to various 
proteins and microinjected into the cytoplasm of TC-7 cells. Specified 
nuclear localization up to protein sizes of 465 kD (ferritin). IgM of 970 
kD and with an estimated radius of 25-40 nm was retained in the 
cytoplasm. 


CYDDEATADSQHSTPPKKK
RKVEDPKDFESELLS
(SEQ ID NO:24) 


SV40 large T protein long NLS. The long NLS but not the short NLS, 
was able to localize the bulky IgM (970 kD) into the nucleus. 
Mutagenesis at the four possible sites of phosphorylation (double 
underlined) impaired nuclear import. 


CGGPKKKRKVG 
(SEQ ID NO:25) 


SV40 large T protein. This synthetic peptide crosslinked to chicken 
serum albumin and microinjected into HeLa cells caused nuclear 
localization. 


PKKKIKV (SEQ ID NO:26) 


A mutated (R→I) version of SV40 large T NLS. Effective NLS.


MKx11CRLKKLKCSKEKPKC
AKCLKx5Rx3KTKR (SEQ ID
NO:27)

74 N-terminal amino acid 


Yeast GAL4 (99 kD). Fusions of the GAL4 gene portion encoding the
74 N-terminal amino acids with E. coli β-galactosidase introduced into
yeast cells specify nuclear localization.


MKx11CRLKKLKCSKEKPKCA
(SEQ ID NO:28)

29 N-terminal amino acid 


Yeast GAL4. Acted as an efficient nuclear localization sequence when
fused to invertase but not to β-galactosidase introduced by
transformation into yeast cells.


PKKARED (SEQ ID NO:29) 
VSRKRPR (SEQ ID NO:30)


Polyoma large T protein. Identified by fusion with pyruvate kinase 
cDNA and microinjection of Vero African green monkey cells. 
Mutually independent NLS. Can exert cooperative effects. 


CGYGVSRKRPRPG 
(SEQ ID NO:31)


Polyoma virus large T protein. This synthetic peptide crosslinked to 
chicken serum albumin and microinjected into HeLa cells caused 
nuclear localization. 


APTKRKGS 
(SEQ ID NO:32) 


SV40 VP1 capsid polypeptide (46 kD). NLS (N terminus) determined 
by infection of monkey kidney cells with a fusion construct containing 
the 5' terminal portion of SV40 VP1 gene and the complete cDNA 
sequence of poliovirus capsid VP1 replacing the VP1 gene of SV40. 


APKRKSGVSKC (1-11)
(SEQ ID NO:33) 


Polyoma virus major capsid protein VP1 (11 N-terminal amino acids).
Yeast expression vectors coding for 17 N-terminal amino acids of VP1
fused to β-galactosidase gave a protein that was transported to the
nucleus in yeast cells. Subtractive constructs of VP1 lacking A1 to C11
were cytoplasmic. This FITC-labeled synthetic peptide, crosslinked to
BSA or IgG, caused nuclear import after microinjection into 3T6 cells.
Replacement of K^ with T did not. 


PNKKKRK (SEQ ID NO:34) 
(amino acid position 317-323) 


SV40 VP2 capsid protein (39 kD). The 3' end of the SV40 VP2-VP3 
genes containing this peptide when fused to poliovirus VP1 capsid 
protein at the gene level resulted in nuclear import of the hybrid VP1 
in simian cells infected with the hybrid SV40. 


EEDGPQKKKRRL (307-318)
(SEQ ID NO:35) 


Polyoma virus capsid protein VP2. A construct having truncated VP2 
lacking the 307-318 peptide transfected into COS-7 cells showed 
cytoplasmic retention of VP2. The 307-318 peptide crosslinked to
BSA or IgG specified nuclear import following their microinjection 
into NIH 3T6 cells. 


GKKRSKA (SEQ ID NO:36) 


Yeast histone H2B. This peptide specified nuclear import when fused
to β-galactosidase.


KRPRP (SEQ ID NO:37) 


Adenovirus E1a. This pentapeptide, when linked to the C-terminus of
E. coli galactokinase, was sufficient to direct its nuclear accumulation
after microinjection in Vero monkey cells.


CGGLSSKRPRP (SEQ ID
NO:38)


Adenovirus type 2/5 E1a. This synthetic peptide crosslinked to chicken
serum albumin and microinjected into HeLa cells caused nuclear
localization.


LVRKKRKTE3SP (NLS 1) 
(SEQ ID NO:39) 
LKDKDAKKSKQE (NLS2) 
(SEQ ID NO:40) 


Xenopus N1 (590 amino acids). Abundant in X. laevis oocytes, forming
complexes with histones H3, H4 via two acidic domains each
containing 21 and 9 (D+E), respectively. The NLS1 is required but not
sufficient for nuclear accumulation of protein N1. NLS1 and 2 are
contiguous at the C-terminus.


GNKAKRQRST 
(SEQ ID NO:41) 


v-Rel or p59v-rel, the transforming protein, product of the v-rel
oncogene of the avian reticuloendotheliosis retrovirus strain T (Rev-T).
v-Rel NLS added to the normally cytoplasmic β-galactosidase directed
that protein to the nucleus.


PFLDRLRRDQK 
(SEQ ID NO:42) 
PKQKRKMAR 


NS1 protein of influenza A virus, which accumulates in nuclei of virus-
infected cells. Determined to be an NLS by deletion mutagenesis of
NS1 in recombinant SV40. The 1st NLS is conserved among all NS1
proteins of influenza A viruses.


SVTKKRKLE (SEQ ID NO:44) 


Human lamin A. Dimerization of lamin A was proposed to give a 
complex with two NLSs that was transported more efficiently. 


(SEQ ID NO:45) 


Xenopus lamin A. NLS inferred from its similarity to human lamin A 
NLS. 


TKGKRKRID 

(SEQ ID NO:46)


Xenopus lamin Lj . NLS inferred from its sequence similarity to 
human lamin A NLS. 


CVRTTKGKRKRIDV 
(SEQ ID NO:47) 


Xenopus lamin Lt. This synthetic peptide crosslinked to chicken 
bovine albumin and microinjected into HeLa cells caused nuclear 
localization. 


ACIDKRVKLD 
(SEQ ID NO:48) 


Human c-myc oncoprotein. This synthetic peptide crosslinked to 
chicken bovine albumin and microinjected into HeLa cells caused 
nuclear localization. 


PAAKRVKLD
(SEQ ID NO:49)
(M1, fully potent NLS)

RQRRNELKRSP 

(SEQ ID NO:50) 

(M2, medium potency NLS) 


Human c-myc oncoprotein. Conjugation of the M1 peptide to human
serum albumin and microinjection of Vero cells gives complete
nuclear accumulation. M2 gave slower and only partial nuclear
localization.


SALIKKKKKMAP 
(SEQ ID NO:51)


Murine c-abl (IV) gene product. The p160gag/v-abl has a cytoplasmic
and plasma membrane localization, whereas the mouse type IV c-abl 
protein is largely nuclear. 


PPKKRMRRRIE 
(SEQ ID NO:52) 
PKKKKKRP (SEQ ID NO:53) 


Adenovirus 5 DBP (DNA-binding protein) found in nuclei of infected 
cells and involved in virus replication and early and late gene 
expression. Both NLS are needed, and disruption of either site 
impaired nuclear localization of the 529 amino acid protein. 


YRKCLQAGMNLEARKTKK 
KIKGIQQATA (497-524 amino 
acid) 

(SEQ ID NO:54)


Rat GR, glucocorticoid receptor (795 amino acids). NLS1 determined by
fusion with β-galactosidase (116 kD). NLS1 is 100% conserved
between human, mouse and rat GR. Whereas the 407-615 amino acid
fragment of GR specifies nuclear location, the 407-740 amino acid
fragment was cytoplasmic in the absence of hormone, indicating that
sequence 615-740 may inhibit the nuclear location activity. A second
NLS (NLS2) is localized in an extensive 256 amino acid C-terminal
domain. NLS2 requires hormone binding for activity.


RKDRRGGRMLKHKRQRDD
GEGRGEVGSAGDMRAANLWPSPLMKRSKK
(amino acid 256-303)
(SEQ ID NO:55)


Human ER (estrogen receptor, 595 amino acids) NLS. NLS is between
the hormone-binding and DNA-binding regions; ER, in contrast with
GR, lacks a second NLS. Can direct a fusion product with
β-galactosidase to the nucleus.


RKFKKFNK 
(SEQ ID NO:56) 


Rabbit PG (progesterone receptor). 100% homology in humans; F→L
change in chickens. When this sequence was deleted, the receptor 
became cytoplasmic but could be shifted into the nucleus by addition 
of hormone; in this case the hormone mediated the dimerization of a 
mutant PG with a wild type PG molecule. 


GKRKNKPK (SEQ ID NO:57) 


Chicken Ets1 core NLS. Within a 77 amino acid C-terminal segment
90% homologous to Ets2. When deleted by deletion mutagenesis at the
gene level the mutant Ets1 became cytoplasmic.


PLLKKIKQ (SEQ ID NO:58) 


c-myb gene product; directs pyruvate kinase to the nucleus.


PPQKKIKS (SEQ ID NO:59) 


N-myc gene product; directs pyruvate kinase to the nucleus.


PQPKKKP (SEQ ID NO:60) 


p53; directs pyruvate kinase to the nucleus.


SKRVAKRKL 
(SEQ ID NO:61) 


c-erb-A gene product; directs pyruvate kinase to the nucleus.


CGGLSSKRPRP 
(SEQ ID NO:62)


Adenovirus type 2/5 E1a. This synthetic peptide conjugated with a
bifunctional crosslinker to chicken serum albumin (CSA) and 
microinjected into HeLa cells directed CSA to the nucleus. 


MTGSKTRKHRGSGA 
(SEQ ID NO:63) 
MTGSKHRKHPGSGA 
(SEQ ID NO:64) 


Yeast ribosomal protein L29. Double-stranded oligonucleotides
encoding the 7 amino acid peptides (underlined) and inserted at the
N-terminus of the β-galactosidase gene resulted in nuclear import.


RHRKHP (SEQ ID NO:65) 
KRRKHP (SEQ ID NO:66) 
KYRKHP (SEQ ID NO:67) 
KHRRHP (SEQ ID NO:68) 
KHKKHP (SEQ ID NO:69) 
RHLKHP (SEQ ID NO:70) 
KHRKYP (SEQ ID NO:71) 
KHRQHP (SEQ ID NO:72) 


Mutated peptides derived from yeast L29 ribosomal protein NLS,
found to be efficient NLS. The last two are less effective NLS,
resulting in both nuclear and cytoplasmic location of β-galactosidase
fusion protein.


PETTVVRRRGRSPRRRTPSP 
RRRRSPRRRRSQS (SEQ ID 
NO:73) 

(One sequence, C-terminus)


Double NLS of hepatitis B virus core antigen. The two underlined 
arginine clusters represent distinct and independent NLS. Mutagenesis 
showed that the antigen fails to accumulate in the nucleus only when 
both NLS are simultaneously deleted or mutated. 


ASKSRKRKL 
(SEQ ID NO:74)


Viral Jun, a transcription factor of the AP-1 complex. Accumulates in
nuclei most rapidly during G2 and slowly during G1 and S. The cell
cycle dependence of viral but not of cellular Jun is due to a C→S
mutation in NLS of viral Jun. This NLS conjugated to rabbit IgG can
mediate cell cycle-dependent translocation.


GGLCSARLHRHALLAT 
(SEQ ID NO:75) 


Human T-cell leukemia virus Tax trans-activator protein. The most 
basic region within the 48 N-terminal segment. Missense mutations in 
this domain result in its cytoplasmic retention. 


DTREKKKFLKRRLLRLDE 

(604-620) 

(SEQ ID NO:76) 


Mouse nuclear Mx1 protein (72 kD), induced by interferons (among
20 other proteins). Selectively inhibits influenza virus mRNA
synthesis in the nucleus and virus multiplication. The cytoplasmic Mx2
has R→S and R→E changes in this region.


CGYGPKKKRKV (SV40 large 

T) (SEQ ID NO:77) 

CGYGDRNKKKKE (human 

retinoic acid receptor) 

(SEQ ID NO:78)

CGYGARKTKKKIK 

(human glucocorticoid receptor) 

(SEQ ID NO:79)

CGYGIRKDRRGGR 

(human estrogen receptor) 

(SEQ ID NO:80) 

CGYGARKLKKLGN 

(human androgen receptor) 

(SEQ ID NO:81)


Synthetic peptides crosslinked to bovine serum albumin (BSA) and 
introduced into MCF 7 or HeLa S3 cells with viral co-internalization 
method using adenovirus serotype 3B induced nuclear import of BSA. 


RKRQRALMLRQAR 
30-42 

(SEQ ID NO:82) 


Human XPAC (xeroderma pigmentosum group A complementing 
protein) involved in DNA excision repair. By site-directed 
mutagenesis and immunofluorescence. NLS is encoded by exon 1 
which is not essential for DNA repair function. 


EYLSRKGKLEL (SEQ ID 
NO:83) 

(at the N-terminus) 


T-DNA-linked VirD2 endonuclease of the Agrobacterium
tumefaciens tumor-inducing (Ti) plasmid. A fusion protein with β-
galactosidase is targeted to the nucleus. The T-plasmid integrates into
plant nuclear DNA; VirD2 produces a site-specific nick for T 
integration. VirD2 also contains a bipartite NLS at its C-terminus (see 
Table 2). 


KKSBCKKRC (SEQ ID NO:84) 
(95-102) 


Putative core NLS of yeast TRM1 (63 kD) that encodes the tRNA
modification enzyme N2,N2-dimethylguanosine-specific tRNA
methyltransferase. Localizes at the nuclear periphery. The 70-213
amino acid segment of TRM1 causes nuclear localization of β-
galactosidase fusion protein in yeast cells. Site-directed mutagenesis of
the 95-102 peptide resulted in its cytoplasmic retention. TRM1 is both 
nuclear and mitochondrial. The 1-48 amino acid segment specifies 
mitochondrial import. 


PQSRKKLR (SEQ ID NO:85) 


Max protein; specifically interacts with c-Myc protein. Fusion of 126- 
151 segment of Max to chicken pyruvate kinase (PK) gene, including
this putative NLS, followed by transfection of COS-1 cells and indirect 
immunofluorescence with anti-PK showed nuclear targeting. 


QPQRYGGGRGRRW (SEQ ID 
NO:86) 


Gag protein of human foamy retrovirus; a mutant that completely lacks 
this box exhibits very little nuclear localization; binds DNA and RNA 
in vitro. 
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Table 4 "Bipartite" or "split" NLS 



Signal Oligopeptide 


protein and features 


C-terminus


Xenopus nucleoplasmin. Deletion analysis demonstrated the 
presence of a signal responsible for nuclear location. 


TKKAGQAKKK (SEQ ID NO:87) 


Xenopus nucleoplasmin 


TKKAGQAKKKKLD 
(SEQ ID NO:88)


Xenopus nucleoplasmin. Whereas these 17 amino acids had NLS
activity, shorter versions of the 17 amino acid sequence were
unable to locate pyruvate kinase to the nucleus.


TKKAGQAKKK(KLD) 
(SEQ ID NO:89) 


Xenopus nucleoplasmin. This 14 amino acid segment was
identified as a minimal nuclear location sequence but was unable
to locate pyruvate kinase to the nucleus; three more amino acids
at either end (shown in parenthesis) were needed.


CGQAKKKKLD 
(SEQ ID NO:90) 


Xenopus nucleoplasmin-derived synthetic peptide; crosslinked to 
chicken serum albumin and microinjected to HeLa cells specified 
nuclear localization. This suggests that nucleoplasmin may 
possess a simple NLS. 


KRPAATKKAGQAKKKK (SEQ ID NO:91)


Xenopus nucleoplasmin bipartite NLS. Two clusters of basic
amino acids (underlined) separated by 10 amino acids are half
NLS components.


HRKYEAPRHx6PRKR (SEQ ID
NO:92)


Yeast L3 ribosomal protein (387 amino acids) N-terminal 21
amino acids. Possible bipartite NLS. (Ribosomal proteins are
transported to the nucleus to assemble with nascent rRNA.)
Fusion genes with β-galactosidase were used to transform yeast
cells followed by fluorescence staining with β-gal antibody. The
373 amino acids of L3 fused to β-gal failed to localize to the
nucleus, unless an 8 amino acid bridge containing a proline was
inserted between L3 and β-gal.


NKKKRKLSRGSSQKTKGTSASAK 
ARHKRRNRSSRS (one sequence) 
(SEQ ID NO:93)


SV40 Vp3 structural protein. (35 amino acid C-terminus). By 
DEAE-dextran-mediated transfection of TC7 cells with mutated 
constructs. 


RVTIRTVRVRRPPKGKHRK 
(SEQ ID NO:94) 


Simian sarcoma virus v-sis gene product (p28sis). The cellular
counterpart c-sis gene encodes a precursor of the PDGF B-chain
(platelet-derived growth factor). The NLS is 100% conserved
between v-sis gene product and PDGF. This protein is normally
transported across the ER; introduction of a charged amino acid
within the hydrophobic signal peptide results in a mutant protein
that is translocated into the nucleus. Pyruvate kinase-NLS fusion
product is transported less efficiently than cytoplasmic v-sis
mutant proteins to the nucleus.


KRKIEEPEPEPKKAK 
(SEQ ID NO:95) 


Putative bipartite NLS of Xenopus laevis protein factor xnf7.
Inferred by similarity to the bipartite NLS of nucleoplasmin.
During oocyte maturation xnf7 is cytoplasmic until the mid-blastula-
gastrula stage due to high phosphorylation. Partial
dephosphorylation results in nuclear accumulation.


KKYENWIKRSPRKRGRPRKD 
(SEQ ID NO:96)


Yeast SWI5 gene product, a transcription factor. Underlined
basic amino acids show similarity to bipartite NLS of Xenopus
nucleoplasmin. The SWI5 gene is transcribed during S, G2 and
M phases, during which the SWI5 protein remains cytoplasmic
due to phosphorylation by CDC28-dependent histone H1 kinase
at three serine residues, two near and one (double underlined) in
the NLS. Translocated at the end of anaphase/G1 due to
dephosphorylation of NLS. NLS confers cell cycle-regulated
nuclear import of SWI5-β-galactosidase fusion protein.


MKRKRNS 735-741 
(SEQ ID NO:97)

GIESEDNVMGMIGILPDMTPSTEM 
SMRGVRISKMGVDETSSAEKIV 
449-495 (SEQ ID NO:98) 


Bipartite NLS of influenza virus polymerase basic protein 2 
(PB2). Mutational analysis of PB2 and transfection of BHK cells 
showed that both regions are involved in nuclear import. 
Deletion of 449-495 region gives perinuclear localization to the 
cytoplasmic side. 


AHRARRLH (SEQ ID NO:99) 
6-13 (BSI) 

PPRRRVRQQPP (SEQ ID NO: 100) 
23-33 (BSII) 

PARARRRRAP (SEQ ID NO: 101) 
39-48 (BSIII)


"Tripartite" or "doubly bipartite" NLS of adenovirus DNA 
polymerase (AdPol). BSI and II functioned interdependently as 
an NLS for the nuclear targeting of AdPol, for which BSIII was 
dispensable. BSII-III was a more efficient NLS than BSI-II.


KRKx11KKKSKK 207-226
(SEQ ID NO:102)


Human poly(ADP-ribose) polymerase (116 kD). The linear 
distance between the two basic clusters is not crucial for NLS 
activity in this bipartite NLS. Lysine 222 (double underlined) is 
an essential NLS component. DNA binding and poly(ADP- 
ribosyl)ating active site are independent of NLS. 


GRKRAFHGDDPFGEGPPDKKGD 
(SEQ ID NO: 103) 


Herpes simplex virus ICP8 protein (infected-cell protein). This 
C-terminal portion of ICP8 introduced into pyruvate kinase (PK)
caused nuclear targeting in transfected Vero cells. Inclusion of 
additional ICP8 regions to PK led to inhibition of nuclear 
localization. 


KRPREDDDGEPSERXRARDDR 
(SEQ ID NO: 104) 


Bipartite NLS of VirD2 endonuclease of rhizogenes strains of 
Agrobacterium tumefaciens. Within the C-terminal 34 amino 
acid. Each region (underlined) independently directs β-
glucuronidase to the nucleus, but both motifs are necessary for
maximum efficiency. VirD2 is tightly bound to the 5' end of the 
single stranded DNA transfer intermediate T-strand transferred 
from Agrobacterium to the plant cell genome. 
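
The bipartite pattern described for nucleoplasmin in this table (two clusters of basic amino acids separated by a 10 amino acid spacer) can be expressed as a simple sequence search. The sketch below is illustrative only: the cluster sizes (at least two, then at least three, K/R residues) and the fixed spacer of exactly 10 residues are assumptions modeled on the nucleoplasmin entry, and several entries in the table (for example, poly(ADP-ribose) polymerase, where the spacing is not crucial) would not fit such a strict pattern.

```python
import re

# Two basic residues, any 10 residues, then three or more basic residues.
# The exact cluster sizes and spacer length are assumptions for illustration.
BIPARTITE = re.compile(r"[KR]{2}.{10}[KR]{3,}")

def looks_bipartite(seq):
    """Return True if the sequence contains the assumed bipartite pattern."""
    return BIPARTITE.search(seq) is not None

nucleoplasmin = "KRPAATKKAGQAKKKK"   # nucleoplasmin bipartite NLS from the table
sv40_simple = "PKKKRKV"              # simple SV40 large T NLS from Table 3

print(looks_bipartite(nucleoplasmin), looks_bipartite(sv40_simple))
```

A pattern this rigid is useful only as a first screen; the table itself shows that functional bipartite signals vary in spacer length and cluster composition.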
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Table 5. "Nonpositive NLS" lacking clusters of arginines/lysines 



Signal oligopeptide 


Protein and features 


QLVWMACNSAAFEDLRVLSFERGTKVS
PRG 327-356
(SEQ ID NO:105)


Influenza virus nucleoprotein (NP). The underlined region
(327-345) when fused to chimpanzee α1-globin at the cDNA level and
microinjected into Xenopus oocytes specifies nuclear localization.


MNKIPIKDLLNPQ
(NLS1 at N-terminus) (SEQ ID
NO:106)

VRILESWFAKNIEN 
PYLDT (NLS2 at amino acid 
141-159, part of the 
homeodomain) 

(SEQ ID NO: 107) 


Yeast MAT α2 repressor protein, containing a homeodomain.
The two NLS are distinct, each capable of targeting β-galactosidase to
the nucleus. However, deletion of NLS2 results in α2 accumulation at
the pores. NLS1 and 2 may act at different steps in a localization
pathway. Part of the homeodomain mediates nuclear localization in
addition to DNA binding. The core pentapeptide containing proline and
two other hydrophobic amino acids flanked by lysines or arginines
(underlined) was suggested as one type of NLS core.


RX7KX15KIPRX3HFY 
EERLSWYSDNED (SEQ ID 
NO: 108) 

152-206 (C-terminal segment)


Drosophila HP1 (206 amino acids) that binds to 
heterochromatin and is involved in gene silencing. NLS identified by β-
galactosidase/HP1 fusion proteins introduced by P-element mediated
transformation into Drosophila embryos. 


FVx7- 
2QMxSLxYMx4MF 


Adenovirus type 5 E1A internal, developmentally-regulated
NLS. This NLS functions in Xenopus oocytes but not in somatic cells. 
This NLS can be utilized up to the early neurula stage. 
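
The distinction drawn between the "simple" NLS of Table 3 and the "nonpositive" NLS listed here (those lacking clusters of arginines/lysines) can be made concrete by measuring the longest uninterrupted run of K/R residues in a peptide. The following is a minimal sketch; the function name is hypothetical, and the MAT α2 NLS1 sequence is used with its final residue read as Q.

```python
BASIC = set("KR")

def longest_basic_run(seq):
    """Length of the longest consecutive stretch of lysine/arginine residues."""
    best = run = 0
    for residue in seq:
        # Extend the current run on K or R, otherwise reset it.
        run = run + 1 if residue in BASIC else 0
        best = max(best, run)
    return best

sv40_large_t = "PKKKRKV"        # simple NLS, Table 3
mat_alpha2 = "MNKIPIKDLLNPQ"    # nonpositive NLS1, this table (last residue read as Q)

print(longest_basic_run(sv40_large_t), longest_basic_run(mat_alpha2))
```

The SV40 peptide contains a run of five basic residues, whereas the MAT α2 NLS1 has only isolated lysines, which is what the table heading means by the absence of arginine/lysine clusters.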
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Table 6. Nucleolar localization signals (NoLS) 



Signal oligopeptide 


Protein and features 


MPKTRRRPRRSQRKRPPTP
(SEQ ID NO:109)


Nucleolus localization signal in amino terminus of human p27x-III
protein (also called Rex) of T cell leukemia virus type I
(HTLV-I). When this peptide is fused to the N-terminus of β-
galactosidase, it directs it to the nucleolus. Deletion of residues 2-
8 (underlined), 12-18 (double-underline) or substitution of the
central RR (dotted-underlined) with TT abolishes nucleolar
localization. Other amino acids between positions 20-80
increase nucleolar localization efficiency.


RLPVRRRRRRVP (SEQ ID NO:110)


Adenovirus pTP1 and pTP2 (preterminal proteins, 80 kD)
between amino acid residues 362-373. The 140 kD DNA 
polymerase of adenovirus when it has lost its own NLS can 
enter the nucleus via its interaction with pTP. The staining was 
nuclear and nucleolar with some perinuclear staining as well. 
The NLS fused to the N-terminus of E. coli β-galactosidase was
functional in nuclear targeting. 


GRKKRRQRRRP 
(SEQ ID NO:111)


HIV (human immunodeficiency virus) Tat protein; localizes 
pyruvate kinase to the nucleolus. Tat is constitutively nucleolar. 


RKKRRQRRR(AHQ) 
Nucleolar localization signal 
(SEQ ID NO: 112) 


Tat positive trans-activator protein of HIV-1 (human
immunodeficiency virus type 1). The 3 amino acids shown in
parenthesis are essential for the localization of β-
galactosidase to the nucleolus. The 9 amino acid basic region is
able to localize β-gal to the nucleus but not to the nucleolus.


KRVKLDQRRRP (SEQ ID NO: 113) 


Artificial sequence from c-Myc and HIV Tat NLSs that 
effectively localizes pyruvate kinase to the nucleolus. 


FKRKHKKDISQNKRAVRR 
(SEQ ID NO:114)


Human HSP70 (heat shock protein of 70 kD); localizes pyruvate 
kinase to the nucleus and nucleolus. HSP70 is physiologically 
cytoplasmic but with heat-shock HSP70 redistributes to the 
nucleoli, suggesting that the nucleolar targeting sequence is 
cryptic at physiological temperature and is revealed under heat- 
shock. 


RQARRNRRRRWRERQR (35-50) 
(SEQ ID NO: 115) 


HIV-1 Rev protein (116 amino acids, nucleolar). Mutations in
either of the two regions of arginine clusters severely impair
nuclear localization. β-galactosidase fused to R4W was targeted
to the nucleus, and fused to the entire 35-50 region, was targeted
to the nucleolus.


RQARRNRRRRWRERQRQ (35-51)
(SEQ ID NO: 116) 


HIV-1 Rev protein. A fusion of this Rev peptide with β-
galactosidase became nuclear but not nucleolar. The 1-59 amino
acid segment of Rev fused to β-galactosidase localized entirely
within the nucleolus. Whereas the NRRRRW (bold) is 
responsible for nuclear targeting, the RR and WRERQRQ 
(double underlined) specify nucleolar localization. Rev may 
function to export HIV structural mRNAs from the nucleus to 
the cytoplasm. 
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Table 7. Karyophilic clusters on non-membrane protein kinases 



Karyophilic peptides 


Non-membrane 
protein kinase 


Species 


Features 


73 FWHKRCHE
(SEQ ID NO:117)
96 DDPRSKHKFKIH
(SEQ ID NO:118)
577 TKHPGKRLG
(SEQ ID NO:119)


Protein kinase C (673 
aa) 


Bovine, human
β type


Known to translocate to the 
nucleus following treatment of 
cells with mitogens. 


71 FWHRRCHEF
(SEQ ID NO:120)
95 DDPRNKHKFRLH
(SEQ ID NO:121)
591 TKHPAKRLG
(SEQ ID NO:122)


Protein kinase C (697 
aa) 


bovine, human γ
type




72 FWHKRCHE 
(SEQ ID NO: 123) 
96 DDPRSKHKFKIH 
(SEQ ID NO: 124) 
577 TKHPGKRLG 
(SEQ ID NO: 125) 


Protein kinase C (673 
aa) 


rabbit types α and
β




71 FWHRRCHE 
(SEQ ID NO: 126) 
95 DDPRNKHKFRLH 
(SEQ ID NO: 127) 
594 TKHPGKRLG 
(SEQ ID NO: 128) 


PKC-I (701 aa) 


rat brain 




22 GENKMKSRLRKG
(not conserved)
(SEQ ID NO:129)
80 SYWHKRCHEYVT
(conserved)
(SEQ ID NO:130)
211 PDDKDQSKKKTR
TIK (not conserved)
(SEQ ID NO:131)
614 PPFKPKIKHRKMC
P (not conserved)
(SEQ ID NO:132)


Protein kinase C 
(639aa, 75 kDa) 


Drosophila 


14 exons, 20 kb; 3 transcripts in
adult flies; not expressed in 0-3 h
Drosophila embryos; the
WHKRCHE (SEQ ID
NO:133) motif (or WHRRCHE
(SEQ ID NO:134)) is conserved
among all known PKCs.


148 KKVLQDKRFK 
NRELQIMRKLD (SEQ 
ID NO: 135) 


Glycogen synthase
kinase 3
GSK-3α
(483 aa)

GSK-3β
(420 aa)


rat brain 


Phosphorylates glycogen synthase,
c-Jun, c-Myb; two isoforms
encoded by discrete genes; highly
expressed in brain; both α and β
forms are cytosolic but also
associated with the plasma
membrane, consistent with their
role in signal transduction from the
cell surface.


LQDRRFKNRELQ 
(SEQ ID NO: 136) 


Zw3
zeste-white 3


Drosophila 


Product of the segment polarity 
gene zw3; the protein encoded has 
34% homology to cdc2; mutations 
in zw3 give embryos that lack 
most of the ventral denticles, 
differentiated structures derived 
from the most anterior region of 
each segment. 


289 ECLKKFNARRKL
KGAIL
(SEQ ID NO:137)


Ca2+/calmodulin-
dependent protein
kinase II (CaM kinase
II) β subunit (542aa,
60.3 kDa)


rat brain 


Composed of nine 50 kDa α-
subunits and three 60 kDa β-
subunits; both are catalytic;
calmodulin- and ATP-binding
domains; highly expressed in
forebrain neurons, concentrated in
postsynaptic densities; acts as a
Ca2+-triggered switch and could be
involved in long-lasting changes in
synapses.


290 LKKFNARRKL
KGAILTTM (SEQ ID
NO:138)

450 EETRVWHRRDGK
(SEQ ID NO:139)


CaM kinase II (478
aa, 54 kDa)
α-subunit


rat brain 


This particular isoform is 
exclusively expressed in the brain; 
high enzyme levels in specific 
brain areas; might be involved in 
short- and long-term responses to 
transient stimuli. 


185 GFAKRVKGRT
WTLCG
(SEQ ID NO:140)


cADPK catalytic
subunit (349 aa, 40.6 
kDa) 


bovine (cardiac 
muscle) 


By Edman degradation of protein 
fragments; mediates the action of 
and is activated by cAMP; consists 
of two regulatory (R) and two 
catalytic (C) subunits; cAMP 
releases the C subunit from the 
inactive R2C2 cADPK; two 
cDNAs were cloned encoding two 
isoforms of the catalytic subunit of 
cADPK in mouse. 


186 GFAKRVKGRTW
TLCG
(SEQ ID NO:141)


cADPK

(catalytic subunit) 
(350 aa) 


bovine 


cDNA was isolated by screening a 
bovine pituitary cDNA library; 
93% sequence similarity to known 
bovine cADPK; represents the 
second gene for the catalytic 
subunit of cADPK. 


29 EEEIQELKRKLH 
KCQSVLP (SEQ ID 
NO: 142) 

389 KILKKRHIVDTR 
(SEQ ID NO: 143) 


cGDPK (SEQ ID
NO: 144) 

(670 aa, 76.3 kDa) 


bovine lung 


By protein sequencing; composed 
of two identical subunits activated 
in an allosteric manner by binding 
of cGMP and not by dissociation 
of catalytic subunit as in cADPK; 
sequence similar to cADPK 


117 KTLKKHTIVK
(SEQ ID NO: 145) 


TPK3 
(398 aa) 
cADPK 


S. cerevisiae


cAMP-DPK is a tetrameric protein 
with two catalytic and two 
regulatory subunits; cAMP 
activates the kinase by dissociating 
the catalytic subunits from the 
tetramer; all three TPK 1, 2, 3 are 
catalytic subunits. 


I6S2H13GHG2 
166 EYCHRHKIVHRD
LKP (SEQ ID NO:146)
495 PLVTKKSKTRWH
FG (SEQ ID NO:147)


SNF1 (633aa, 72 kDa) 


S. cerevisiae 


Ser/Thr kinase;
autophosphorylated; plays a
central role in carbon catabolite
repression in yeast; required for
expression of glucose-repressible
genes; region 60-250 shows high
sequence similarity to cAMP-
dependent protein kinase
(cADPK).


70 PVKKKKKREIK
(SEQ ID NO:148)
269 DILQRHSRKRW
ERF (SEQ ID NO:149)
146 PKSSRHHHTDG
(SEQ ID NO:150)


Casein kinase II (α-
subunit, catalytic)
(336aa)

CKII (β-subunit,
regulatory) (215 aa)


Drosophila 
melanogaster 

Drosophila 
melanogaster 


CKII is composed of α and β
subunits in an α2β2 130-150 kDa
protein; the α-subunit is the
catalytic and the β is
autophosphorylated.


142 PKSSRHHHTDG 
(SEQ ID NO: 151)


CKII (β-subunit,
regulatory) (209 aa,
24.2 kDa) 


bovine (lung) 




108 PKQRHRKSLG 
(SEQ ID NO: 152) 
129 GSMCKVKLAK 
HRYTNE 
(SEQ ID NO: 153) 
506 DRKHAKIRNQ 
(SEQ ID NO: 154) 
638 GNIFRKLSQRR 
KKTIEQ 

(SEQ ID NO: 155) 
773 PPLNVAKGRKL 
HP (SEQ ID NO: 156) 


KIN1 (1064 aa, 117 
kDa) 


S. cerevisiae 


30% aa similarity to bovine 
cADPK and 27% (KIN1) or 25% 
(KIN2) aa similarity to v-Src 
within the kinase domain; the 
catalytic domains of KIN1 and
KIN2 are near the N-terminus and
are structural mosaics with features
characteristic of both Tyr and
Ser/Thr kinases.


87 ELRQFHRRSLG 
(SEQ ID NO: 157) 
111 GKVKLVKHRQ 
TKE (SEQ ID NO: 158)
217 GSLKEHHARKF
ARG (SEQ ID NO: 159)
807 LSVPKGRKLHP 
(SEQ ID NO: 160) 


KIN2 (1152 aa, 126
kDa) 


S. cerevisiae 




60 FLRRGKKKLTLD
(SEQ ID NO: 161) 
472 PSKDDKFRHWC 
RKIKSKIKEDKRIKRE 
(SEQ ID NO: 162) 


STE7 (515aa) 


S. cerevisiae 


Implicated in the control of the
three cell types in yeast (a, α, and
a/α), of which a and α cells are
haploid and are specialized for
mating whereas a/α cells are
diploid and are specialized for
meiosis and sporulation; with the
exception of the mating type locus,
MAT, all cells contain the same
DNA sequences. Mutation of the STE7 gene
produces insensitivity to cell-
division arrest induced by the yeast
mating hormone, α-factor.


722 QRRVKKLPSTTL 
(SEQ ID NO: 163) 
QRRVKKLPSITL 
(SEQ ID NO: 164) 


S6KIIα (733 aa)
S6KIIβ


Xenopus 
Xenopus 




742 QRRVKKLPSTTL 
(SEQ ID NO: 165) 


S6KII (752 aa) 


Chicken 




713 QRRVRKLPSTTL
(SEQ ID NO: 166) 


S6KII (724aa) 


Mouse 

16 GVVYKGRHKTTG
(SEQ ID NO: 167)
120 FCHSRRVLHRD
LKP (SEQ ID NO: 168)


CDC2Hs 

(297aa) 

p34cdc2


Human 


Isolated by expressing a human
cDNA library in S. pombe and
selecting for clones that
complement a mutation in the cdc2
yeast gene; the human CDC2 gene
can complement both the
inviability of a null allele of S.
cerevisiae CDC28 and cdc2
mutants of S. pombe; CDC2
mRNA appears after that of
CDK2.


GVVYKARHKLSGR
(SEQ ID NO: 169) 


cdc2 (297aa) 


S. pombe 


High homology to S. cerevisiae
CDC28. 


119 HSHRVLHRDLKP
(SEQ ID NO: 170)


CDK2 (cell division 
kinase 2) (298 aa) 


Human 


The human CDK2 protein has 65%
sequence identity to human
p34cdc2 and 89% sequence
identity to Xenopus Eg1 kinase;
human CDK2 was able to
complement the inviability of a
null allele of S. cerevisiae CDC28
but not cdc2 mutants in S. pombe.
CDK2 mRNA appears in late
G1/early S.


109 FCHSHRVLHRD 
LKP (SEQ ID NO: 171)


Eg1 (297 aa)


Xenopus 


Cdk2-related 


125 GIAYCHSHRILH 
RDLKP 

(SEQ ID NO: 172)


CDC28 (298 aa)


S. cerevisiae 


The homolog of S. pombe Cdc2


119 HSHRVIHRDLKP
(SEQ ID NO: 173) 


cdk3 (305aa) 


Human 




56 KELKJHKNIVR 
(SEQ ID NO: 174) 


PSSALRE (291 aa) 
(SEQ ID NO: 175) 


Human 


cdc2-related kinase.


1 MDRMKKIKRQ (N- 
terminus) (SEQ ID 
NO: 176) 

141 DKPLSRRLRRV 
(SEQ ID NO: 177) 


PCTAIRE-1 (496 aa) 


Human 


cdc2-related kinase. 


1 MKKFKRR 
(SEQ ID NO: 178) 
129 RNRIHRRIS 
(SEQ ID NO: 179) 
172 SRRSRRAS 
(SEQ ID NO: 180) 
304 HRRKVLHR 
(SEQ ID NO: 181)
512 GHGKNRRQSM 
LF (SEQ ID NO: 182) 


PCTAIRE-2 (523 aa) 


Human 


cdc2-related kinase.


163 HTRKILHR 
(SEQ ID NO: 183) 
369 PGRGKNRRQSIF 
(SEQ ID NO: 184) 


PCTAIRE-3 
(380 aa) 


Human 


cdc2-related kinase.

69 EVFRRKRRLH 
(SEQ ID NO: 185)
302 DKPTRKTLRKSR 
KHH (SEQ ID NO: 186) 


KKIALRE (358 aa) 
(SEQ ID NO: 187) 


Human 


cdc2-related kinase. 


1 MVKRHKNT 

(SEQ ID NO: 188) 

87 DGELFHYIRKHGP 

(SEQ ID NO: 189) 

114 DAVAHCHRFRFR

HRD (SEQ ID NO: 190) 

Z7J ISJVoooiSJV V VJKJvU 

QQRDD 

(SEQ ID NO: 191)


nim1+ gene product
(new inducer of 
mitosis); protein 
kinase (370 aa) 


S. pombe 




194 PAQKLRKKNNFD 
(SEQ ID NO: 192) 
388 KQHRPRKNTNFT 
PLPP (SEQ ID NO: 193)
592 KYAVKKLKVKF 
SGP (SEQ ID NO: 194) 


Wee1+ gene product
(877aa) 


S. pombe 


The Wee1+ gene functions as a
dose-dependent inhibitor that
delays the initiation of mitosis
until the yeast cell has attained a
certain size; Wee1 has a protein
kinase consensus, probably
regulating cdc2 kinase.


266 PNETRRIKRAN 
RAG (SEQ ID NO: 195) 


CDC7 (497 aa) 


S. cerevisiae 


Required for mitotic but not 
meiotic DNA replication 
presumably to phosphorylate 
specific replication protein factors; 
implicated in DNA repair and 
meiotic recombination; some 
homology with CDC28 and
oncogene protein kinases but 
differs in a large region within the 
phosphorylation receptor domain. 


48 YDHVRKTRVAIKK
(SEQ ID NO: 196) 


ERK1 (MAP kinase) 
(367 aa; 42 kDa) 


Rat 


Known to translocate to the 
nucleus following their activation 
by phosphorylation at T-190 and
Y-192 (T-183, Y-185 in ERK2).


59 ILKHFKHE 
(SEQ ID NO: 197) 


FUS3 (353aa) 


S. cerevisiae 


MAP-(ERK1)-related.


252 QIKSKRAKEY 
(SEQ ID NO: 198) 


KSS1 (368 aa) 


S. cerevisiae 


MAP-(ERK1)-related.


ELVKHLVKHGSN 
(SEQ ID NO: 199) 
GKAKKIRSQLL 
(SEQ ID NO:200) 


SWI6 

(803aa, 90kDa) 


S. cerevisiae 


Activator of CACGA-box with 
sequence similarity to cdc10;
required at START of cell cycle. 


EQRLKRHRIDVSDED
(SEQ ID NO:201)
SNIKSKCRRW
(SEQ ID NO:202)


cdc10


S. pombe 




37 PPKRIRTD 
(suggested by the 
authors) (SEQ ID 
NO:203) 

492 KLARKQKRP 
(SEQ ID NO:204) 


CTD kinase (528 aa) 
58 kDa subunit 
(catalytic) 


S. cerevisiae


Consists of 3 subunits of 58, 38, 
and 32 kDa; disruption of the 58 
kDa gene gives cells that lack CTD 
kinase, grow slowly, are cold 
sensitive, but have different 
phosphorylated forms of RNA pol 
II. 

29 GVSSWRRCIHKP 
(SEQ ID NO:205) 


Phosphorylase kinase 
(catalytic subunit) 
(386aa) 


Rabbit (skeletal 
muscle) 




489 KKYMARRKW 
QKTGHAV 
(SEQ ID NO:206) 


Myosin light chain 
kinase (MLCK) (669 
aa) 


Chicken gizzard 


Ca2+/calmodulin-activated;
phosphorylated by cADPK; first 
described as responsible for the 
phosphorylation of a specific class 
of myosin light chains; required 
for initiation of contraction in 
smooth muscle. 


314 PWLNNLAEKAK
RCNRRLKSQ 
(SEQ ID NO:207) 
334 ILLKKYLMKRR 
WKKNFIAVS 
(SEQ ID NO:208) 


Myosin light chain 
kinase (partial 368 
carboxy-terminal aa 
sequence) 


Rabbit (skeletal 
muscle) 


By protein sequencing. 


28 GVSSWRRCIHKP 
(SEQ ID NO:209) 


Phosphorylase kinase 
(PhK) (catalytic γ
subunit) (389 aa) 


Mouse (muscle) 


Glycogenolytic regulatory enzyme; 
undergoes complex regulation; 
composed of 16 subunits 
containing equimolar ratios of α, β,
γ and δ subunits; high levels in
skeletal muscle; isoforms in 
cardiac muscle and liver; cDNA 
probe does not hybridize to X 
chromosome in mice and is thus 
distinct from the mutant recessive 
PhK deficiency that results in 
glycogen storage disease. 
Table 8. Nuclear localization signals on DNA repair proteins



Putative NLS 


Gene 
product 


Equivalent protein 
in other species 


Features 


HIGHER EUKARYOTES 








None 

(N-terminus) 

MDPGKDKEGvpqpsgppaRKKF 
(bipartite NLS) 
(SEQ ID NO:210)


ERCC1 


RAD10


297 aa; DBD; interacts
strongly with ERCC4 (XPF)
to form an excision
endonuclease; unless the
KDKx11RKK is a bipartite
NLS, it may depend upon its
binding with ERCC4 for its
nuclear import.


None 

681 DKRFARGDKRGKLPR
(near the C-terminus) (four
positive, one negative over a
heptapeptide stretch)
(SEQ ID NO:211)


ERCC2 
(XPD) 


RAD3 (S. cer) 


760 aa; DNA helicase 
component of TFIIH, 
essential for cell viability; 
contains one nucleotide- 
binding, one DNA-binding, 
and seven domains
characteristic of helicases; 
52% identity with S. cer 
RAD3 at the amino acid 
level. 


8 DRDKKKSRKRHYEDEE 

(SEQ ID NO:212)

522 YVAIKTKKRILLYTM

(SEQ ID NO:213)

(weak NLS if at all, hydrophobic
environment)

769 PSKHVHPLFKRFRK

(SEQ ID NO:214)


ERCC3 
(XPB) 


SSL2 (S. cer)
Haywire (Dros)


782 aa; helicase, component
of TFIIH essential for cell
viability; helix-turn-helix,
DNA-BD, and helicase
domains 


84 KKQTLVKRRQRKD 

(SEQ ID NO:215)

210 EFTKRRRTL 

(SEQ ID NO:216) 

390 DESMIKDRKDRLP 

(SEQ ID NO:217)

1170 GKKRRKLRRARGRK

RKT (SEQ ID NO:218)


ERCC5 
(XPG) 


RAD2; 
Rad13


1186 aa in human, 1196 in X.
laevis; 3' incision 
endonuclease; involved in 
homologous recombination; 
strongly nuclear 


253 PQKQEKKPRKIMLNEASG

(SEQ ID NO:219)

314 PNKKARVLSKKEERLKK

HIKKLQKR (SEQ ID NO:220)


ERCC6 
CS-B 


RAD26


1493 aa; involved in the
preferential repair of active
genes; nonessential for cell
viability


406 PLPKGGKRQKKVP 

(SEQ ID NO:221)

455 DGDEDYYKQRLRRWNK

LRLQDKEKRLKLEDDSEESD

(SEQ ID NO:222)

1028 DVQTPKCHLKRRIQP

XrPKRKKFP (SEQ ID NO:223)

1180 KHKSKTKHHSVAEEETL

EKHLRPKQKPKX15PHLVKK

RRY (SEQ ID NO:224)

1324 PAGKKSRFGKKRN

(SEQ ID NO:225)

21 PASVRASIERKRQRALM
LRGAR (SEQ ID NO:226)
160 PPLKFIVKKNPHHSQW
GD (weak) (SEQ ID NO:227)
210 NREKMKQKKFDKKVKE
(weak because of F) 
(SEQ ID NO:228) 


XPA 


RAD14 


273 aa; zinc finger domain; 
involved in lesion 
recognition 


72 YLRRAMKRFN (weak) 

(SEQ ID NO:229) 

262 PSAKGKRNKGGRKKRSK 

PSSSEEDEGPG (SEQ ID 

NO:230) 

297 QRRPHGRERR (weak) 

(SEQ ID NO:231)

368 RTHRGSHRKDP (weak) 

(SEQ ID NO:232) 

384 SSSSSSSKRGKKMCSDG 

(SEQ ID NO:233) 

531 ALKRHLLKYE (weak) 

(SEQ ID NO:234)

594 SNRARKARLAEP 

(SEQ ID NO:235) 

660 PNLHRVARKLD (weak) 

(SEQ ID NO:236) 

716 ERKEKEKKEKR 

(SEQ ID NO:237) 

740 IRERLKRRYG 

(SEQ ID NO:238) 

801 GGPKKTKRERK 

(SEQ ID NO:239) 


XPC 


RAD4 (23% identity, 
44% similarity) 


823 aas, 92.9 kDa; very 
hydrophilic protein; might be 
involved in lesion 
recognition since XPC cells 
(40% of all XP cases) can 
repair active parts of the 
genome whereas inactive parts and
the nontranscribed strand of
active genes are not repaired


20 KSKAKSKARREEEEED 

(SEQ ID NO:240) 

54 GKRKRG (SEQ ID NO:241) 

69 GPAKKKVAKVTVK 

(SEQ ID NO:242) 

103 PSDLKKAHHLKRG 

(SEQ ID NO:243) 


XPC 




940 aa; the first 117 aa are
lacking in the Legerski and 
Peterson, (1992) XPC 
sequence (see above); the 
following 823aa are 
identical. 


82 EIDRRJCKRPLENDGPVKK
KVKKVQQKE (SEQ ID 
NO:244) 

375 KENVRDKKKG 

(SEQ ID NO:245) 

571 FGRRKLKKWVT 

(SEQ ID NO:246) 

710 PLIKKRKDEIQG 

(SEQ ID NO:247) 

1091 KELEGLINTKRKRLKYF 

AKLW (SEQ ID NO:248) 


Rep-3 
(mouse) 
Duc-1 
(HeLa) 


Swi4 (S. pom)


1137 aa; mismatch repair
protein; Rep-3 is in the
immediate 5' flanking region
of DHFR gene (89 bp) but 
transcribed from the opposite 
strand; a bidirectional 
promoter is used for both 
transcripts. 


422 EKHEGKHQKLL (weak) 
(SEQ ID NO:249) 


hMSH2 


MSH2 (S cer) 


human mismatch repair 
protein; homologous to S. 
cerevisiae MSH2; associated 
with the hereditary 
nonpolyposis colon cancer 
gene on chromosome 2p16.

397 PDIRRLTKKLNKRG 

(SEQ ID NO:250) 

547 DAKELRKHKKYIE 

(SEQ ID NO:251)

869 VKMAKRKANE 

(SEQ ID NO:252) 


MSH2 
(S. cer)






95 GELAKRSERRAEAE 

(SEQ ID NO:253) 

354 KRKEPEPKGSTKKKAK 

TG (SEQ ID NO:254) 

394 GKFKRGK (SEQ ID 

NO:255) 


Human Rad2 


Rad2 (S. pom) 


400 aa; required for fidelity 
of chromosome separation at 
mitosis; limited similarity to 
RAD2 (ssDNA nuclease), 
rad13, and XPG (ERCC5).


None 


mouse 
RAD51 




339 aa; recombination-repair 
protein; 83% homology to S.
cerevisiae RAD51 and 55% 
homology to E. coli RecA. 


None 


HHR23B 
/p58 


RAD23 


Subunit of XPC (125 kDa)


None 


HHR23A 


RAD23 


Subunit of XPC (125 kDa) 


32 PSQAEKKSRARAQ 
(SEQ ID NO:256) 


RPA (34 kDa 
subunit) 




RPA (70, 34, and 14 kDa 
subunits) might stabilize the 
helicase-melted DNA around 
the lesion; antibodies against 
RPA 32 kDa subunit inhibit 
DNA replication. 


GAKKRKIDDA 
(SEQ ID NO:257)


ATPase Q1


RecQ (E. coli)


649 aa; altered in XPC cells; 
undetermined role in repair 


PKKPRGKM (SEQ ID NO:258) 
EHKKKHP (SEQ ID NO:259) 
ETKKKFKDP (SEQ ID NO:260) 
EKSKKKK(E/D)4i (SEQ ID 
NO:261) 

E3G2KKKKKFAK (SEQ ID 
NO:262) 


HMG-1 




Calf thymus HMG 1 
(259 aa); involved in the 
recognition of cisplatin 
lesions 


512 RDEKKRKQLKKAKAK
MAKDRKSRKKP (SEQ ID
NO:263)

619 GESSKRDKSKKKKKVKV
KMEKK (SEQ ID NO:264)
674 GENKSKKKRRRSEDSEE
EE (SEQ ID NO:265)


SSRP1 


ABF (S. cer)


709 aa, 81 kDa, structure-
specific recognition protein
1; involved in recognition of
cisplatin-induced lesions;
also involved in Ig gene
recombination; one HMG-
box, similarity to SRY,
MTFII, LEF-1, TCF-1α, and
ABF2.


1 MPKRGKKG (SEQ ID 
NO:266) 


Ref-1 
(HAP1)




Redox factor 1 from HeLa 
cells; 37 kDa, 318 aa; 
apurinic/apyrimidinic (AP)
endonuclease for DNA repair 
but also of redox activity 
stimulating Jun/Fos DNA 
binding. 


1 MPKRGKKG 
(SEQ ID NO:267) 


HAP1
(bovine)


ExoIII
(E. coli)
Exo A (S.
pneumoniae) 


323 aa; apurinic/apyrimidinic 
(AP)-endonuclease 

DROSOPHILA 








1 MGPPKKSRKDRSGGDKF 
GKKRRGQDE 
(SEQ ID NO:268) 
EMSYSRKRQRFLVNQG 
(weak) (SEQ ID NO:269) 
YYEHRJCKNIGSVHPLFK 
KFRG (bipartite) (SEQ ID 
NO:270) 


Haywire 


ERCC3 (XPB) 
SSL2 (S cer) 


helicase with 66% identity to 
human ERCC3; flies 
expressing marginal levels of 
Haywire display motor 
defects and reduced life span 


77 ARGKKKQPK (SEQ ID 
NO:271) 

98 KPKGRAKKA (SEQ ID 
NO:272) 

157 QAKGRKKKELP (SEQ ID 
NO:273) 

179 EPPKQRARKE (SEQ ID 
NO:274) 

241 PPKAASKRAKKGK (SEQ 
ID NO:275) 

282 PKKRAKKTT (SEQ ID 
NO:276) 

317 EPAPGKKQKKSAD (SEQ 
ID NO:277) 

336 EEEAKPSTETKPAKGR 
KKAP (SEQ ID NO:278) 
372 KPARGRKKA (SEQ ID 
NO:279) 

394 GSKTTKKAKKAE 
(SEQ ID NO:280) 


Rrp1


HAP1 


(Recombination repair protein
1); 679 aa; the 252 aa C-
terminal domain is
homologous to AP-
endonucleases, whereas the
1-426 aa domain is highly
charged and carries all of the
putative NLSs.


S. CEREVISIAE 








200 IEKRRKJLYISGG 

(SEQ ID NO:281)

515 NKKRGVRQVLLN (SEQ

ID NO:282)

565 KEQVTTKRRRTRG
(conserved in Rad16) (SEQ ID
NO:283)

1024 NLRKKIKSFNKLQ
(SEQ ID NO:284)


RAD1 


ERCC4 

(XPF) 

Rad16


1100 aa; 30% sequence
identity to Rad16; RAD1
interacts strongly with
RAD10


89 RQRKERRQGKRE 

(SEQ ID NO:285) 

907 ENKFEKDLRKKLVNNE 

(SEQ ID NO:286) 

984 RDVNKRKKKGKQKRI 

(SEQ ID NO:287) 

1017 KRISTATGKLKKRKM 

(SEQ ID NO:288) 


RAD2 


XPGC 
Rad13


1031 aa, 117.8 kDa; ssDNA
endonuclease; rad mutants 
are defective in incision 


672 GKDDYGVMVLADRRF 
SRKRSQLP (contains the bulky 
F) (SEQ ID NO:289) 


RAD3 
(S. cer) 


ERCC2 or XPD;
Rad15 or Rhp3


778 aa, 89,779 Da; 30%
sequence identity to rad16;
ATP-dependent DNA 
helicase; single-stranded 
DNA-dependent ATPase. 

26 PLSRRRRVRRKNQPLPD 

AKKKFKTG (SEQ ID NO:290) 

134 NEERKRRKYFHMLYL

(SEQ ID NO:291)

160 EWINSKRLSRKLSNL

(weak) (SEQ ID NO:292)

254 EMSANNKRKFKTLKRSD

(weak) (SEQ ID NO:293)

382 WMNSKVRKRRITKDDF

GEK (SEQ ID NO:294)

403 RKVITALHHRKRTKID 

DYED (SEQ ID NO:295) 

504 KTGSRCKKVIKRTVGRP 

(SEQ ID NO:296) 


RAD4 


XPC 


754 aa; mutations in RAD4
that inactivate the
excision repair function of
RAD4 result in truncated
proteins missing the C-
terminal one-third of RAD4.


150 FHPKRRJRJYGFR (SEQ ID 
NO:297) 

215 DSRGRKKASM (SEQ ID 
NO:298) 

297 DGESLMKRRRTEGGNK 
REK (SEQ ID NO:299)
1152 DEDERRKRRIEE
(SEQ ID NO:300) 


RAD5 




1169 aa; helicase involved in 
postreplication-repair (RAD6
epistasis group); binds DNA 
with the seven helicase 
motifs and with zinc fingers; 
increases the instability of 
poly (GT) repeats in the yeast 
genome. 


1 MSTPARRRLMRDFKRM 
KEDAPP (SEQ ID NO:301) 


RAD6 




RAD6 mediates the 
ubiquitination of H2A and 
H2B histones 


15 GVAKLRKEKSGAD 
(SEQ ID NO:302) 
76 DDYNRKRPFRSTRPGK 
(SEQ ID NO:303) 


RAD10


ERCC1 


210 aa; forms an 
endonuclease with RAD1; 
the basic and tyrosine-rich 
central domain was 
suggested to bind DNA by 
ionic interactions and 
tyrosine intercalation. 


172 EGKAHRREKKYE 
(SEQ ID NO:304) 
200 NRLREKKHGKAHIHH 
(SEQ ID NO:305) 


RAD14 


XPAC 


247aa, 29.3 kDa; two zinc 
fingers; involved in lesion 
recognition; 27% sequence 
identity and 54% sequence 
similarity (if conserved 
residues are grouped 
together) to human XPA; 
deletion of the RAD14 gene
generates high UV 
sensitivity. 


345 ERRKQLKKQGPKRP 
(SEQ ID NO:306) 
479 ETYKKRIKEWESCYPDE 
(SEQ ID NO:307) 


Ixrl 

(S. cer)




591 aa; two consecutive 
HMG boxes; involved in 
recognition of 1,2-intrastrand 
d(GpG) and d(ApG) cisplatin 
crosslinks. 


None 


RAD23 


HHR23 

483 LTCKKLKTHNRIILSG 

(weak) (SEQ ID NO:308)

934 NALRKSRKKITKQYEIGT 

PX9GEIRKRDP 

(SEQ ID NO:309) 


RAD26 

(yeast 

ERCC6) 


ERCC6 
CS-B (hum) 


1075aa; disruption of the 
RAD26 gene gives viable 
yeast cells unable to 
preferentially repair the 
actively transcribed strands; 
surprisingly, in contrast to 
human CS-B cells, disruption 
of the RAD26 in yeast does 
not cause sensitivity to UV, 
cisplatin, or X-rays.


634 KPTSKPKRVRTATKKKIP 

(SEQ ID NO:310)

408 FYKKRSPVTRSKKSG

(SEQ ID NO:311)


MRE11 


Rad32 (S pom) 


meiotic recombination 
protein; functions in the 
same pathway with RAD51


none; 

361 GFKKGKGCQR 
(SEQ ID NO:312)


RAD51 


RecA (E. coli) 


402 aa; essential for repair of 
DSBs and recombination; 
associates strongly with 
RAD52; self associates; 
neither RAD51 nor RAD52
possesses a typical simple
NLS.


none; 

328 GFKKGKGCQR 
(SEQ ID NO:313)


RAD51 (K. 
lactis) 




364 aa 


none; 

155 ERAKKSAVTDALKRSLR 
GFGXgDKDFLAKIDKVKFDP 
PD (tripartite) 
(SEQ ID NO:314)


RAD52 


Rad22 


504 aa; rad52 mutants are 
defective in ionizing 
radiation, mitotic 
recombination, mating-type
switching, and repair of
DSBs.


1 MARRRLPDRPP 
(SEQ ID NO:315)
65 GGRSLRKRSA
(SEQ ID NO:316)
99 QLTKRRKD
(SEQ ID NO:317)


RAD54




898 aa; recombination-repair 
protein; ATP-binding motif; 
helicase domains; in the 
same subfamily of helicases 
with MOT1 and SNF2. 


269 DETVFVKSKRVKASSS 

(extremely weak NLS, if at all)

(SEQ ID NO:318)

317 GEDRKREGRNLKR

(SEQ ID NO:319)


RAD55 




Similarity to RecA, and 
lower similarity to RAD51, 
RAD57, and DMC1


j / A JrloKi^oJ\JtNJt<JvrJJ I Kvr 

(SEQ ID NO:320) 






460 aa; nucleotide-binding 
domain; limited similarity to 
RAD51 


62 GLKKPRKKTKSSRH 

(SEQ ID NO:321)

688 GRILRAKRRNDEG 

(SEQ ID NO:322) 

784 GRGSNGHKRFKS (weak) 

(SEQ ID NO:323) 


SSL2 


ERCC3 (XPB) 


843 aa; putative helicase that 
seems to function in repair 
but also in the removal of 
secondary structures in the 5' 
untranslated region of mRNA 
to allow ribosome binding 
and scanning. 






WO 01/93836 



PCT/US01/18657 




50 TRRHLCK1KGLSE (weak) 

(SEQ ID NO:324)

277 DGRKPIGGHX12RKGRG 

DER (bipartite) (SEQ ID 

NO:325) 


DMC1


RecA 


334 aa; yeast homolog of 
RecA, meiosis-specific; 
dmc1 mutants are defective
in reciprocal recombination 
and accumulate DSBs 


11 ETEKRCKQKEQRY
(SEQ ID NO:326)


PMS1 




904 aa, 103 kDa; mismatch-
repair protein; MutL 
(Salmonella) and HexB 
(Streptococcus) homolog 


None 

1 MDLRVGRKFRIGRKIG 
(SEQ ID NO:327) 
139 GRRGX8GLSKKYRDFNT
HRHIP (Bipartite weak NLS) 
(SEQ ID NO:328) 


HRR25 


Hhp1, Hhp2 (S. pom)
CKI (mamm)


Mutations in HRR25 Ser/Thr 
protein kinase cause defects 
in DNA repair and 
retardation in cell cycling 


96 HELTKRSSRRVETEK 
(SEQ ID NO:329) 


YKL510 




383 aa; structure-specific 
endonuclease; two domains 
of about 100 aa with 
sequence similarity to N- and 
C-terminal regions of RAD2. 


200 MLAMARRKKKMSAK 

(SEQ ID NO:330) 

617 EHYKVKHTEK (weak 

NLS) (SEQ ID NO:331)

670 LHPEKKRSISE (weak

NLS) (SEQ ID NO:332)


MOT1 




Modifier of transcription 1;
1867 aa; DNA helicase of S.
cerevisiae required for
viability; increases gene
expression of several, but
not all, pheromone-
responsive genes in the
absence of STE12; the 1257
to 1825 aa domain (568 aa
residues) has homology to
SNF2 and RAD54


S. POMBE 








60 SSIDEx5SIKRKRRI (SEQ ID
NO:333) 


Swi4 


Duc-1 
Rep-3 


113 kDa; CKII sites are
upstream of NLS like in 
SV40 large T; the 
homologous prokaryotic 
MutS and HexA lack NLS 


96 GELAKRVARHQKARE 
(weak NLS) (SEQ ID NO:334) 
362 GSAKRKRDS 
(SEQ ID NO:335) 
372 KGGESKKKR 
(SEQ ID NO:336) 


Rad2 




380 aa 


None 


Rad9 




427 aa; no homology to other 
DNA repair proteins; rad9 
fission yeast mutants are 
sensitive to both UV and 
ionizing radiation; may be 
involved in recombination- 
repair. 


None 

681 DKRYGRSDKRTKLPK 
(SEQ ID NO:337) 


Rhp3 or
rad15


ERCC2 
RAD3 


772 aa; DNA helicase; 65% 
identity to RAD3 and 55% 
identity to ERCC2; essential 
for viability 

464 PPSKRRRVRGG 
(SEQ ID NO:338) 


Rad16


RAD1


Functions in repair of UV
damage for both cyclobutane
dimer and (6-4) photoproduct
lesions; Rad16 interacts with
Swi10.


431 DFKQAILRKRKNESPE 
EVEP (SEQ ID NO:339)


Rad21 




628 aa, 67.8 kDa, acidic 
protein; a single base 
substitution in mutant rad21- 
45, changing an Ile into a
Thr, is responsible for the
low efficiency in repair of
DSBs after γ-radiation
although capable of arresting 
at G2. 


490 DKKAKKG (SEQ ID 
NO:340)


Rad22 


RAD52 


496 aa; functions in 
recombination-repair and 
mating-type switching.


394 DVVQFYLKKKYTRSKRN 
DG (weak because of Y) (SEQ 
ID NO:341) 

575 PSPALLKKTNKRRELP 
(SEQ ID NO:342)


Rad32 


MRE11 (S cer) 


648 aa; meiotic 
recombination protein; rad32 
mutants are sensitive to γ-
and UV radiation; functions
in the same pathway with
Rhp51 (RAD51).




Rad51 




recombination-repair


GLAKKYRDHKTHLHIP (weak 
NLS because of Y and H) (SEQ 
ID NO:343) 


Hhp1


CKI (mamm) 
HRR25 (S cer) 


Ser/Thr protein kinase; 
mutation in this gene causes 
repair defects 


None 

GLAKKYRDFKTHVHIP (H in
Hhp1 is replaced by F in Hhp2)
(SEQ ID NO:344)


Hhp2 


CKI (mamm) 
HRR25 (S cer) 


Ser/Thr protein kinase; 
mutation in this gene causes 
repair defects 
Table 9. NLS in Transcription factors



NLS and Flanks 


Protein factor and features 


highly basic 




TTT> .ADTDV^D /OCA TT~\ 

HK4QKIKJV7K (SEQ ID
NO:345)

LRRKSRP (SEQ ID NO:346)
SRRTKRRQ (SEQ ID NO:347)


Human LjCt (UC-iactor; 


GRKRKKRT (SEQ ID NO:348) 


Oct-6 protein transcription factor from mouse cells 


GRRRKKRT (SEQ ID NO:349) 


Mouse Oct-2 protein transcription factors (Oct-2.1 to Oct-2.6
isoforms) 


ARKRKRT (SEQ ID NO:350) 
NRRQKGKRS (SEQ ID 
NO:351) 


Oct-3 from mouse P19 embryonal carcinoma cells


ECRRKKKE 
(SEQ ID NO:352) 


Human ATF-1. In basic region/leucine zipper. 


ERKKRRRE (SEQ ID NO:353) 
AKCRNKKKEKT (SEQ ID 
NO:354) 


Human ATF-3 (in basic region that binds DNA)


SKKKIRL (SEQ ID NO:355) 
QKGNRKKM (SEQ ID NO:356) 
VKKVKKKL (SEQ ID NO:357) 


Mouse PU.1 (Friend erythroleukemia cells). Related to ets oncogene


VKRKKI (SEQ ID NO:358) 
CRNRYRKLE (SEQ ID 
NO:359) 

IRKRRKMK (SEQ ID NO:360) 
PKKKRLRL (SEQ ID NO:361) 


Human PRDII-BF1 that binds to IFN-β gene promoter. (The largest
DNA-binding protein known, of 298 kD). 


GKKKKRKREKL 
(within the HMG-box) 
(SEQ ID NO:362) 


Murine LEF-1 (397 aa). Lymphoid-specific with an HMG1-like box.
NLS is identical to that of human TCF-1α.


GKKKKRKREKL 
(within the HMG-box) 
(SEQ ID NO:363) 


Human TCF-1α (399 aa)

(T cell-specific transcription factor that activates the T cell receptor
Cα). Contains an HMG box. NLS core is identical to that of murine
LEF-1.


GKKKRRSREKH 

(within the HMG-box) (SEQ ID 

NO:364) 

PKKCRARF (SEQ ID NO:365)


Human TCF-1

(uniquely T cell-specific). HMG box containing. 


FKQRRIKL (SEQ ID NO:366) 
NRRRKKRT (SEQ ID NO:367) 
NRRQKEKRI (SEQ ID NO:368) 


Xenopus laevis Oct-1 (within POU-domain)


DKRSRKRKRSK (SEQ ID 
NO:369) 

RLRIDRKRN (SEQ ID NO:370) 
AKRSRRS (SEQ ID NO:371) 


Drosophila Suvar(3)7 gene product involved in position-effect
variegation (932 aas). Five widely spaced zinc-fingers could help 
condensation of the chromatin fiber. 

IRKRRKMKSVGD2E2 (SEQ ID 
NO:372) 

(not suggested as NLS by the 
authors; between the 1st and 2nd 
zinc finger) 
PPKKKRLRLAE 
(suggested as NLS by the 
authors; just before 2nd zinc 
finger) (SEQ ID NO:373) 

(within 1st zinc finger) 
(SEQ ID NO:374)


Human MBP-1 (class I MHC enhancer binding protein 1), MW 200
kD. Induced by phorbol esters and mitogens in Jurkat T cells.


PRRKRRV (SEQ ID NO:375) 
HRYKMKRQ (SEQ ID NO:376) 


rat TTF-1 (thyroid nuclear factor that binds to the promoter of 
thyroid-specific genes). A homeodomain protein.


DGKRKRKN (SEQ ID NO:377) 
DDSKRVAKRKL (SEQ ID 
NO:378) 

NRERRRKEE (SEQ ID NO:379) 
WKQRRKF (SEQ ID NO:380) 


Human thyroid hormone receptor a (c-erbA-1 gene). Belongs to the 
family of cytoplasmic proteins that are receptors of hydrophobic 
ligands such as steroids, vitamin D, retinoic acid, thyroid hormones. The
ligand binding may expose the NLS for nuclear import of the 
receptor-ligand complex. 


NRRKRKRS (SEQ ID NO:381) 
PKKKKL (SEQ ID NO:382) 


Drosophila gcl (germ cell-less) gene product (569 aa, 65 kD), located
in nuclei, required for germ line formation. 


ARRKRRRL (SEQ ID NO:383) 
LKFKKVRD (SEQ ID NO:384) 
FKKFRKF (SEQ ID NO:385) 
GKQKRRF (SEQ ID NO:386) 
ERLKRDKEKREKE (SEQ ID 
NO:387) 

TRGRPKKVKE (SEQ ID 
NO:388) 

orLKi<AjRKi<JCK I (oEQ ID 
NO:389) 

NO:390) 

SRKSKKRLRA (SEQ ID
NO:391) 


C. elegans Sdc-3 protein (sex-determining protein) (2,150 aas). A
zinc finger protein. 


LKKIRRKIKNKI (SEQ ID 
NO:392) 

ESRRKKKE (SEQ ID NO:393) 


Drosophila BBF-2 (related to CREB/ATF) 










DRNKKKKE (SEQ ID NO:394) 
ARRRRP (SEQ ID NO:395) 


Xenopus RAR (retinoic acid receptor) 


GRRRRA (SEQ ID NO:396) 
DEKRRKV (SEQ ID NO:397) 
CRQKRKV (SEQ ID NO:398) 


Human ATF-2 (the 2nd and 3rd NLS are in basic region that binds 
DNA) 


ERKRRD (SEQ ID NO:399) 
SRKKLRME (SEQ ID NO:400) 


Myn (murine homolog of Max). Forms a specific DNA-binding 
complex with c-Myc oncoprotein through a helix-loop-helix/leucine 
zipper. 


EEKRKRTYE (SEQ ID NO:401) 


human NFkB p65 (550 aa). 

Not binding DNA; complexed with p50 that binds DNA. NFkB p50 
also contains a NLS (Table 3b). 

GRRRRA (SEQ ID NO:402)
DEKRRKF (SEQ ID NO:403) 
SRCRQKRKV (SEQ ID 
NO:404) 


Human HB16, a cAMP response element-binding protein 


SKKKKTKV (SEQ ID NO:405) 
NRPDKKKI (SEQ ID NO:406) 
QRRKKP (SEQ ID NO:407) 
QKKRRFKT (SEQ ID NO:408) 


Human TFIIE-β (general transcription initiation protein factor; forms
tetramer α2β2 with TFIIE-α)


SRKRKM (SEQ ID NO:409) 


Human kup transcriptional activator (433 aas). Two distantly spaced 
zinc fingers. Expressed in hematopoietic cells and testis. 


ERKRLRNRLA (SEQ ID 
NO:410) 

ATKCRKRKL (SEQ ID
NO:411)

(19 aa stretch) 


Mouse Jun-B homologue to avian sarcoma virus 17 oncogene v-jun 
product. One region is similar to yeast GCN4 and to Fos. 


DKRX6ERKRRD (N-terminus) 
(SEQ ID NO:412) 
QSRKKLRME (C-terminus) 
(SEQ ID NO:413)


Max (specifically associates with c-Myc, N-Myc, L-Myc). The Max- 
Myc complex binds to DNA; neither Max nor Myc alone exhibit 
appreciable DNA binding. 


DKEKKIKLEEDE (within an 
acidic region) (SEQ ID NO:414) 
IKKAKKV (SEQ ID NO:415) 
TRRKKN (SEQ ID NO:416) 


Chicken VBP (vitellogenin gene-binding protein). Leucine zipper. 
Related to rat DBP. 


TRDDKRRA (SEQ ID NO:417) 
EVERRRRDK (SEQ ID 
NO:418) 


Xenopus borealis Bl factor. Closely related to the mammalian USF. 
Binds to CACGTG in TFIIIA promoter to developmentally regulate 
its expression. 


TRDEKRRA (SEQ ID NO:419) 
EVERRRRDK (SEQ ID 
NO:420) 


Human USF (upstream stimulatory factor) activating the major late 
adenovirus promoter 


YRRYPRRRG (SEQ ID 
NO:421) 

QRRPYRRRRF (SEQ ID 
NO:422) 

YRPRFRRG (SEQ ID NO:423) 
QRRYRRN (SEQ ID NO:424) 
YRRRRP (SEQ ID NO:425) 


YB-1, a protein that binds to the MHC class II Y box. YB-1 is a 
negative regulator. 


AI^RQKKD (SEQ ID NO:426) 
ERRRRF (SEQ ID NO:427) 


Human TFEB Binds to IgH enhancer. 


LKERQKKD (SEQ ID NO:428) 
IERRRRFN (SEQ ID NO:429) 
YFRRRJRLEKD (SEQ ID 
NO:430) 


Human TFE3 (536 aa). Binds to μE3 enhancer of IgH genes.


KTVALKRRKASSRL (SEQ ID NO:431)


Human Drl (176 aa, 19 kD). Interacts with TBP (TATA-binding 
protein) thus inhibiting association of TFIIA and/or TFIIB with TBP. 
TBP-Drl association is affected by Drl phosphorylation to repress 
activated and basal transcription. 


1 LRRRGRQTY (SEQ ID 
NO:432) 

27 LTRRRRIEM (SEQ ID 
NO:433) 

51 QNRRMKLKKEI (SEQ ID 
NO:434) 


Drosophila ultrabithorax protein (from the conserved 61 amino acid 
homeodomain segment only). Conserved in the Antennapedia
homeodomain protein. 

SNRRRPDHR (SEQ ID NO:435) 
VYRGRRRVRRE (SEQ ID 
NO:436) 

P7AP2RRRRSADNKD2 (SEQ 
ID NO:437) 

PKKPRHQF (SEQ ID NO:438) 


C. elegans sex-determining Tra-1 protein. Zinc finger. Peaks in the 
second larval stage. 


EKRKKERN (SEQ ID NO:439) 
LLRRLKKEVE (SEQ ID 
NO:440) 

EPLGRIRQKKRVY2D2 (SEQ 
ID NO:441)

(EDAIKKRKEARERRRLRQ) 
(SEQ ID NO:442) 
DKETTASRSKRJRSSRKKRT 
(SEQ ID NO:443) 
ESKKKKPKL (SEQ ID NO:444) 
KKTAAKKTKTKS (SEQ ID 
NO:445) 


Yeast NPS1 transcription protein factor (1359 aa) involved in cell 
growth control at G2 phase. Has a catalytic domain of protein 
kinases. 


QRKRQKL (SEQ ID NO:446) 
KAKKQK (SEQ ID NO:447) 
LRRKRQK (SEQ ID NO:448) 


Human 243 transcriptional activator (968 aas), induced by mitogens 
in T cells. N-terminal half is homologous to oncoprotein Rel and 
Drosophila Dorsal protein involved in development. The C-terminal 
half contains repeats found in proteins involved in cell-cycle control 
of yeast and tissue differentiation in Drosophila.


RDIRRRGKNKV (SEQ ID 
NO:449) 

QNCRKRKXE (SEQ ID 
NO:450) 


Mouse NF-E2 (45 kD), an erythroid transcription factor from mouse 
erythroleukemia (MEL) cells. Involved in globin gene regulation. 
Binds to AP-l-like sites. Homology to Jun B, GCN4, Fos, ATF1 and 
CREB in basic region/leucine zipper (see Fig. 2). 






Group 069x00 




DKIRRKN (SEQ ID NO:451) 
ARKTKKKI (SEQ ID NO:452) 


Human glucocorticoid receptor 


473 DKIRRKNCP (SEQ ID 
NO:453) 

EARKTKKKJKGIQ (SEQ ID 
NO:454) 


Mouse and human GR (glucocorticoid receptor)






Group 096x0 




YRVRRERN (SEQ ID NO:455) 
VRKSRDKA (SEQ ID NO:456) 

TYRT P'K'P VTh /"CTyrfc TTi XTrV/1 ^7"\ 
JL/I\J-fINJ\J\. V n ^oli^ IU iNU.HJ / J • 


C/EBP (CCAAT/enhancer binding protein). 
Functions in liver-specific gene expression. 


DKIRRKN (SEQ ID NO:458) 
ARKSKKL (SEQ ID NO:459)


Human mineralocorticoid receptor 


DKIRRKN (SEQ ID NO:460) 
GRKFKKF (SEQ ID NO:461) 


Human PR (progesterone receptor) 


EEVQRKRQKLMP (SEQ ID 
NO:462) 


Human and mouse NFkB 105 kD precursor of p50 (968 aas) (first R 
is at 361 position). 


EEVQRKRQKL (SEQ ID 
NO:463) 


Human NF-kB p50 (DNA-binding subunit). Identical to protein 
KBF1, homologous to rel oncogene product. NF-kB p65 also 
contains a NLS (Table 3a). 


GKTRTRKQ (SEQ ID NO:464) 
ARRKSRD (SEQ ID NO:465) 


Human TEF-1 (SV40 transcriptional enhancer factor 1). 426 aa. 
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QRKERKSKS (SEQ ID NO:466) 
TKSKTKRKL (SEQ ID NO:467) 


Rat, mouse, human IRF-1 (interferon regulatory factor- 1). Induced in 
lymphoma T cells by the pituitary peptide hormone prolactin. 
Regulates the growth-inhibitory interferon genes. 


GKCKKKN (SEQ ID NO:468) 


Ehrlich ascites S-II transcription factor. A general factor that acts at 
the elongation step. 


ERSKKRSRE (SEQ ID NO:469) 
ERELKREKRKQ (SEQ ID 
NO:470) 

ARRSRLRKQ (SEQ ID NO:471) 


Tobacco TAF-1 transcriptional activator 


YKLDHMRRRIETDE (SEQ ID 
NO:472) 


Drosophila TFIIEα (433 aa), a general transcription factor for RNA
polymerase II. Composed of subunits α and β.


DKNRRKS (SEQ ID NO:473) 
IRKDRRG (SEQ ID NO:474) 
IKRSKKN (SEQ ID NO:475) 


Human ER (estrogen receptor); 595 aa. 


EQRRHRIE (SEQ ID NO:476) 
TTRAEKKRLL (SEQ ID 
NO:477) 

IDKKRSKEAKE (SEQ ID NO:478)


Yeast ADA2 (434 aa), a potential transcriptional adaptor required for 
the function of certain acidic activation domains. 


EAALRRKIRTISK 
(SEQ ID NO:479) 


Yeast GCN5 gene product (439 aa), required for the function of 
GCN4 transcriptional activator and for the activity of the HAP2-3-4 








r. rnlm QQ Y QQ 

vjroup udaoo 




NKKMRRNRF (SEQ ID 
NO:480)

NRRKX4RQK (SEQ ID NO:481) 


Mouse LFB3 


1 JSJvOKJvJNxvJr ^oiiv^ WJ 
NO:482) 

1 > ix rvrw'wzj. ivniv ^ocv^ vu iNu.toJ ) 


Mouse LFB1


NKKMRRNRFK (SEQ ID 
NO:484) 


rat vHNFl-A 


NKKMRRNR (SEQ ID NO:485)


murine HNF-1β


TKKGRRNRF (SEQ ID
NO:486) 


iiiu Uoc iiiN r " l 


NKKMRRNRF (SEQ ID
NO:487) 


liuitinii viuir x 


TKKGRRNRF (SEQ ID 
NO:488) 


rat liver HNF1 


LRRQKRFK (SEQ ID NO:489) 
QQH3SH4Q (SEQ ID NO:490) 


rat HNF-3β


LRRQKRFK (SEQ ID NO:491) 


rat HNF-3γ


LRRQKRFK (SEQ ID NO:492) 


rat HNF-3α


LKEKERKA (SEQ ID NO:493)
MKKARKV (SEQ ID NO:494) 


rat DBP a protein factor that binds to the D site of the albumin gene 
promoter 


PRRERRY (SEQ ID NO:495) 


rat AT-BP1. Highly acidic domain. Two zinc fingers. Binds to the 
B-domain of α1-antitrypsin gene promoter and to the NF-kB site in
the MHC gene enhancer. 


DRRVRKGKV (SEQ ID 
NO:496) 


A 19 kD Drosophila melanogaster nonhistone associated with
heterochromatin.
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NO:497) 


murine EBF (early B-cell factor) of 591 aa. Regulates the pre-B and
B lymphocyte-specific mb-1 gene. Expressed in pre-B and B-cell
lines but not in plasmacytomas, T-cell and nonlymphoid cell lines.


GRRTRRE (SEQ ID NO:498) 


human Sp1


DEQKRAEKKAKE (SEQ ID NO:499)

IRRIHKVTRP (SEQ ID NO:500) 
LLRRLKKDVE (SEQ ID 
NO:501) 


yeast SNF2, a transcriptional regulator of many genes. 






Group 0x99x0 




AKAKAKKA (SEQ ID NO:502) 
YKMRRERN (SEQ ID NO:503) 
VRKSRDKA (SEQ ID NO:504)


mouse AGP/EBP (87% similarity to C/EBP), ubiquitously expressed 


AKAKAKKA (SEQ ID NO:505) 
YKMRRERN (SEQ ID NO:506) 
VRKSRDKA (SEQ ID NO:507) 


rat LAP, a 32-kD liver-enriched transcriptional activator, also present 
in lung, with 71% sequence similarity to C/EBP. Leucine zipper. 
Accumulates to maximal levels around birth. 


YRQRRER (SEQ ID NO:508) 
VKKSRLKSKQK (SEQ ID 
NO:509) 


Ig/EBP-1 (immunoglobulin gene enhancer-binding protein). Forms 
heterodimers with C/EBP. 


EDPEKEKRIKELE (SEQ ID 
NO:510) 

IVIJvKJvV ^C^J LLf 1 I ) 


mouse c-Myb 


DYYKVKRPKTD (SEQ ID 

NO:512)

GRARGRRHQ (SEQ ID NO:513)

FRYRKIKDIY (SEQ ID 
NO:514) 


Drosophila eyes absent protein (760 aa), a nuclear protein that 
functions in early development to prevent programmed cell death and 
to allow the events that generate the eye to proceed. Mutations cause
programmed cell death of eye progenitor cells. 






Group 0x0x00 






tt — *:rvr->r> . : — — — 

rat IL-6DBP interacting with interleukin-6 responsive elements. Has
a leucine zipper domain.


DKRQRNRC (SEQ ID NO:516)
FkrtirkD 


mouse H-2RIIBP (MHC class I genes H-2 region II binding protein). 

Member of the nuclear hormone receptor superfamily.


FkrtirkD 

DKRQRNRC (SEQ ID NO:517)


chicken RXR, related to RAR (retinoic acid receptor), a nuclear
protein factor from the thyroid/steroid hormone receptor family.


VKSKAKKT (SEQ ID NO:518)
YKIRRERN (SEQ ID NO:519) 
VRKSRDKA (SEQ ID NO:520) 


human NF-IL6 (345 aa). Specifically binds to IL1-responsive
element in the IL-6 gene. Leucine zipper. Homology to C/EBP.


QKKNRNKC (SEQ ID NO: 521) 


mouse PPAR (peroxisome proliferator activated receptor) 






Group 000XX00 




EQIRKLVKKHG (SEQ ID 
NO:522) 


yeast RAP1

It binds regulatory sites at yeast mating type silencers. 


FRRSMKRKA (SEQ ID 
NO:523) 


human vitamin D receptor (427 aa) 






Group OOxxOO 
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jLfJ\J\^i^Jvtvri ^ori^ i±J lskj.jZh) 


mouse WT1 (the murine homolog of human Wilms' tumor 
predisposition gene WT1) 




human WT33 (Wilms tumor predisposition)






Group 009xxe




LKESKRKYDE (SEQ ID 
NO:526) 


yeast SWI3 99 kD, highly acidic protein. Global transcription 
activator. 


EVLKVQKRRIYD (SEQ ID 
NO:527) 


human RBAP-1 (retinoblastoma-associated protein 1) factor (412 aa). 
A protein that binds to the pocket (functional domain) of the 
retinoblastoma (RB) protein involved in suppression of cell growth 
(tumor suppressor). The transcription factor E2F, implicated in cell 
growth, binds to the same pocket of RB. 
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Table 10. NLS in other nuclear proteins 



Putative NLS 


Protein 


YKSKKKA (SEQ ID NO:528) 
TKKLPRKT (SEQ ID NO:529) 


Yeast L3 


TRKKGGRRGRRL (SEQ ID NO:530) 
C-terminus . 


Yeast 59 ribosomal protein 


ARATRRKRCKG (SEQ ID NO:531)


Yeast L16 ribosomal protein


GKGKYRNRRW (SEQ ID NO:532) 


yeast L2 ribosomal protein (homologous to
Xenopus L1). Encoded by intronless genes.


GKGKMRNRRRIQRRG (SEQ ID NO:533) 
NKKVKRRELKKN (SEQ ID NO:534) 
AKTARRKA (SEQ ID NO:535) 
IKAKEKKP (SEQ ID NO:536) 
GKPKAKKP (SEQ ID NO:537) 
AKAKKRQ (SEQ ID NO:538) 


Xenopus laevis L1 ribosomal protein (homologous
to yeast L2). Encoded by intronless genes.


ERKRKS (SEQ ID NO:539) 
GKRPRTKA (SEQ ID NO:540) 
HKRRRI (SEQ ID NO:541) 
LKKQRTKKNKE (SEQ ID NO:542) 


human S6 ribosomal protein (homologous to yeast 
S10) 


PKMRRRTYR (SEQ ID NO:543) 
KKKISQKKLKK (SEQ ID NO:544) 


Rat L17 ribosomal protein (184 aas)


YMRRRTYRA (SEQ ID NO:545) 
EVKKVSKKKL (SEQ ID NO:546) 


Podocoryne carnea (hydrozoan, Coelenteratum)
L17 ribosomal protein (184 aas) highly
homologous to rat L17.


ERNRKDKDAKFR (SEQ ID NO:547) 


human, rat ribosomal S13 protein 


ERKRKS (SEQ ID NO:548) 
QRLQRKRH (SEQ ID NO:549) 
IRKRRA (SEQ ID NO: 550) 


yeast S10 ribosomal protein (homologous to human 
S6) 


GRRRKKHRSRSRSRERRSRSRDRGRGi 2GRER 
DRRRSRDRER (SEQ ID NO:551) 


35 kD subunit of U2 small nuclear
ribonucleoprotein auxiliary factor (U2AF), an
essential mammalian splicing factor. U2AF35
interacts with the 65 kD subunit (U2AF65). Both
proteins are concentrated in a small number of
subnuclear organelles, the coiled bodies.


EFEDPRD (SEQ ID NO:552) 
ETREERME (SEQ ID NO:553) 
EAGDAPPDP (SEQ ID NO:554) 
EERMERKRREK (SEQ ID NO:555) 
HRDRDRDRERERRESRERDKERERRRSRSRD 
RRRRSRSRDKEERRRSRERSKDKDRDRKRRS 
SRSRERARRERERKEE (SEQ ID NO: 556) 
RDRDRERRRSHRSERERRRDRDRDRDRDREH 
KRGER (SEQ ID NO:557)


human U1 snRNP-associated 70K protein (437 aas)
that is phosphorylated at Arg/Ser-rich domains;
involved in splicing


QKRNNKKSKKKRCAE (SEQ ID NO:558)
EKLRKLKI (near C-terminus) (SEQ ID NO:559) 


yeast TRM1 enzyme for the N2,N2-
dimethylguanosine modification of both
mitochondrial and cytoplasmic tRNAs. TRM1 is
both nuclear and mitochondrial. The first motif is
within a region (70-213 aa segment) known to
cause nuclear localization of β-galactosidase.


NKRKRV (SEQ ID NO:560) 
SLKNRSNRKRE (SEQ ID NO:561) 
EPKRKRRLP (SEQ ID NO:562) 
ARMRHSKR (C-terminus) (SEQ ID NO:563) 


Yeast nucleoporin NUP1 (1076 aa, 113 kD); an
integral component of the pore complex. Involved 
in both binding and translocation steps of nuclear 
import. 
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Putative NLS 


Protein 


KAEKEX3KVD2E2 (SEQ ID NO:564) 
KX3KX5KX3R (SEQ ID NO:565) 


Chicken, Xenopus NO38 nucleolar protein (38 kD);
involved in intranuclear packaging of preribosomal 
particles. Shuttles between nucleus and cytoplasm. 


KTEREAEKALEEKX7R (SEQ ID NO:566) 
Kxf KX7KX4IOG EDTTEETLR (SEQ ID NO:567) 
RG2RG2RG3RG2FG2RG3RGFG2RG3FRG2RG4 
DHKPQGKKIKFE (SEQ ID NO:568) 
(C-terminus)


Chicken, hamster nucleolin (92 kD). Binds 
preribosomal RNA. Shuttles between nucleus and 
cytoplasm. 




human SATB1 (763 aa) which binds selectively to
AT-rich MARs with mixed
A, T, C on one strand excluding G. Binds to minor
groove with little contact with bases.


QKKKQMKAD (SEQ ID NO:570) 
(KKEKKE)s (SEQ ID NO:571) 
JsJUiKJsJ<Js.bJiD (bfc-Q ID NO: 572) 
EEKKSKKSKK (SEQ ID NO:573) 


yeast CBF5p, a centromere-binding protein
(55 kDa, 483 aa). The KKE repeat at its C-terminus
occurs in microtubule-binding domains; yeast cells
containing only three copies of the KKE repeat of
CBF5p delay at G2/M; depletion of CBF5p arrests
cells at G1/S.


TKKKSFKL (SEQ ID NO:574) 


yeast CCE1, a cruciform cutting endonuclease 


KSERERMLRESLKEERRRF (SEQ ID NO:575) 


rat nucleoporin 155 or Nup155 (1390 aas, 155
kDa), a protein of the nuclear pore complex;
contains 46 consensus sites for various kinases;
associated with both the nucleoplasmic and the
cytoplasmic region of pores.


PKKGSKKA (SEQ ID NO:576) 
DGKKRKRSRKES (SEQ ID NO:577) 


human H2B variant differentially expressed during 
the cell cycle 


GAKRHRKVLRD (SEQ ID NO:578)
14-24 

PAIRRLARRG (SEQ ID NO:579) 
32-41 

EHAKRKT (SEQ ID NO:580)
74-80 


Calf thymus histone H4 
(102 aa) 


ARRIRGERA 127-135 (SEQ ID NO:581)


Calf thymus H3 
(135 aa) 


ESHHKAKGK 121-129 (SEQ ID NO:582)


Calf thymus H2A 
(129 aa) 


RGKSGKARTKAKSRSSR 3-19 (SEQ ID 
NO:583) 


Sea urchin Psammechinus miliaris H2A (123 aa) 


PKKGSKKA 10-17 (SEQ ID NO:584) 
QKKDGKKRKRSRKES 22-36 (SEQ ID NO:585)


Calf thymus H2B 

(125 aa)


GGKKRHRKRKGSY (SEQ ID NO:586)
22-34 


Sea urchin Psammechinus miliaris H2B (122 aa) 


PRTDKKRRRKRKES 19-32 (SEQ ID NO:587)


Starfish H2B 
(121 aa) 


PAKAPKKKA 12-20 (SEQ ID NO:588) 
EAKKPAKKA 104-112 (SEQ ID NO:589) 
AKKPKKV 128-134 (SEQ ID NO:590) 
AKKSPKKAKKP 142-152 (SEQ ID NO:591) 
PKKVKKP 183-189 (SEQ ID NO:592) 


Trout testis H1
(194 aa) 
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Putative NLS 


Protein


PRRKAKRA 30-37 (SEQ ID NO:593) 
PKKAKKT 119-125 (SEQ ID NO:594)
AKAKKAKA 129-136 (SEQ ID NO:595) 
AKKARKAKA 139-147 (SEQ ID NO:596) 
AKKAKKPKKKA 171-181 (SEQ ID NO:597)
AKKAKKPAKK 182-191 (SEQ ID NO: 598) 
SPKKAKKP 192-199 (SEQ ID NO:599) 
AKKSPKKKKAKRS 200-212 (SEQ ID NO:600)
PKKAKKA 213-219 (SEQ ID NO:601) 
AKKAKKS 227-233 (SEQ ID NO:602)
PRKAGKRRSPKKARK 234-248 (SEQ ID 
NO:603) 


Sea urchin Parechinus angulosus sperm H1 (248
aa) 


ARRRKTA 1-7 (SEQ ID NO:604) 
IRKFIRKA 55-61 (SEQ ID NO:605) 
PKKKKA 83-88 (SEQ ID NO:606) 
AKKPKAKKVKKP 89-100 (SEQ ID NO:607) 
AKKKTNRARKPKTKKNR 104-120 (SEQ ID NO:608)


Annelid sperm H1a
(119aa) 


PKRKVSS 1-7 (SEQ ID NO:609)
EEPKRRSARLS 14-24 (SEQ ID NO:610) 


Calf thymus HMG14 
(100 aa) 


PKGKKGKA 52-59 (SEQ ID NO:612) 


Calf thymus HMG17
(89 aa; 9,247 D)


PKKPRGKM (SEQ ID NO:613) 
EHKKKHP (SEQ ID NO:614)
ETKKKFKDP (SEQ ID NO:615) 
EKSKKKKf E/D)4 j (SEQ ID NO:616) 
E3 G?KKKKKFAK (SEQ ID NO:617) 


Calf thymus HMG 1 
(259 aa) 


EHKKKHP (SEQ ID NO:618) 
PKGDKKGKKKDP (SEQ ID NO:619)
E4G^KKK#KFAK (SEQ ID NO:620)


Calf thymus HMG 2 
(256 aa) 


PKRKSATKGDEPARR 1-15 (SEQ ID NO:621) 
KPKKAAAPKKA 30-34 (SEQ ID NO:622) 


Trout testis H6 (60 aa) 
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Claims 

What is claimed is: 

1. A method for producing micelles with entrapped therapeutic agents,
comprising:

a) combining an effective amount of a negatively charged
therapeutic agent with an effective amount of a cationic lipid in
a ratio where about 30% to about 90% of the negatively charged
atoms are neutralized by positive charges on lipid molecules to
form an electrostatic micelle complex in about 20% to about
80% ethanol; and

b) combining the micelle complex of step a) with an effective
amount of a fusogenic-karyophilic peptide conjugate in a ratio
range of about 0.0 to about 0.3, thereby producing micelles
with entrapped therapeutic agents.
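
The charge-neutralization window in step a) can be illustrated with a short calculation. This is only a sketch under assumptions that the claim does not state: the agent is DNA with one negative charge per nucleotide, an average nucleotide mass of about 330 g/mol, and one positive charge per cationic lipid molecule. The function and parameter names are hypothetical.

```python
def fraction_neutralized(dna_ug, cationic_lipid_nmol,
                         nucleotide_mass=330.0, charges_per_lipid=1):
    """Return lipid positive charges divided by DNA negative charges.

    Assumptions (not from the claim): one phosphate charge per nucleotide,
    ~330 g/mol per nucleotide, and `charges_per_lipid` charges per lipid.
    """
    if dna_ug <= 0:
        raise ValueError("dna_ug must be positive")
    # ug divided by g/mol gives umol; multiply by 1000 to get nmol.
    phosphate_nmol = dna_ug / nucleotide_mass * 1000.0
    return cationic_lipid_nmol * charges_per_lipid / phosphate_nmol


def within_claimed_range(fraction, low=0.30, high=0.90):
    """Check a fraction against the 'about 30% to about 90%' window."""
    return low <= fraction <= high


if __name__ == "__main__":
    frac = fraction_neutralized(dna_ug=330.0, cationic_lipid_nmol=600.0)
    print(f"fraction neutralized: {frac:.2f}",
          "in range" if within_claimed_range(frac) else "out of range")
```

Because the claim says "about", the hard bounds in `within_claimed_range` are a simplification; a multivalent lipid such as DOGS would need a larger `charges_per_lipid`.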

2. The method of claim 1, wherein the negatively charged therapeutic
agent is a therapeutic agent selected from the group consisting of a polynucleotide 
and a negatively charged drug. 

3. The method of claim 2, wherein the polynucleotide is a DNA
polynucleotide or an RNA polynucleotide.

4. The method of claim 2, wherein the polynucleotide is a DNA
polynucleotide.

5. The method of claim 4, wherein the DNA polynucleotide comprises 
plasmid DNA. 

6. The method of claim 1, further comprising combining an effective
amount of an anionic lipid in step a).

7. The method of claim 6, wherein the anionic lipid is dipalmitoyl
phosphatidyl glycerol (DDPG) or a derivative thereof.

8. The method of claim 4, further comprising combining an effective
amount of a DNA condensing agent selected from the group consisting of spermine,
spermidine, polylysine, polyarginine, polyhistidine, polyornithine and magnesium or
a divalent metal ion.

9. The method of claim 5, wherein the plasmid DNA comprises a
sequence encoding p53, HSV-tk, p21, Bax, Bad, IL-2, IL-12, GM-CSF, angiostatin,
endostatin and oncostatin.

10. The method of claim 1, wherein the cationic lipids are selected from
the group consisting of 3β-(N-(N',N'-dimethylaminoethane)carbamoyl)cholesterol,
dimethyldioctadecyl ammonium bromide (DDAB), N-[1-(2,3-
dimyristyloxy)propyl]-N,N-dimethyl-N-(2-hydroxyethyl) ammonium bromide
(DMRIE), 1,2-dimyristoyl-3-trimethylammonium propane (DMTAP),
dioctadecylamidoglycylspermine (DOGS), N-(1-(2,3-dioleoyloxy)propyl)-N,N,N-
trimethylammonium chloride (DOTMA), 1,2-dipalmitoyl-3-trimethylammonium
propane (DPTAP), 1,2-distearoyl-3-trimethylammonium propane (DSTAP).

11. The method of claim 10, wherein the cationic lipids are combined
with the fusogenic lipid DOPE in a molar ratio from about 1:1 to about 2:1.

12. The method of claim 11, wherein the cationic lipids are combined
with the fusogenic lipid DOPE in a molar ratio of 1:1.

13. The method of claim 1, wherein the fusogenic-karyophilic peptide is 
an NLS peptide. 

14. The method of claim 13, wherein the NLS peptide is a peptide
selected from the group consisting of Seq. ID Nos. 20-622.

15. The method of claim 1, wherein the fusogenic-karyophilic peptide
conjugate is a sole fusogenic peptide.

16. The method of claim 1, wherein the NLS peptide component of the
fusogenic-karyophilic peptide conjugate is an NLS peptide selected from the group
consisting of Seq. ID Nos. 20-622.

17. The method of claim 1, wherein the fusogenic/NLS peptide
conjugates comprise amino acid sequences selected from the group consisting of
(KAWLKAF)3 (SEQ ID NO:1), GLFKAAAKLLKSLWKLLLKA (SEQ ID NO:2),
LLLKAFAKLLKSLWKLLLKA (SEQ ID NO:3) as well as all derivatives of the
prototype (Hydrophobic3Karyophilic1Hydrophobic2Karyophilic1)2-3 where
Hydrophobic is any of the A, I, L, V, P, G, W, F and Karyophilic is any of the K, R,
or H, containing a positively-charged residue every 3rd or 4th amino acid, that form
alpha helices and direct a net positive charge to the same direction of the helix.
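
The "positively-charged residue every 3rd or 4th amino acid" rule can be checked mechanically. The sketch below is illustrative only: it uses the residue sets named in this claim, and the helper names are hypothetical. It tests spacing and residue composition, not whether a peptide actually forms an alpha helix.

```python
HYDROPHOBIC = set("AILVPGWF")   # residues listed in the claim as Hydrophobic
KARYOPHILIC = set("KRH")        # residues listed in the claim as Karyophilic


def positive_spacings(seq):
    """Distances between consecutive K/R/H positions in the sequence."""
    positions = [i for i, aa in enumerate(seq) if aa in KARYOPHILIC]
    return [b - a for a, b in zip(positions, positions[1:])]


def spacing_ok(seq):
    """True if there are at least two positive residues, each 3 or 4 apart."""
    gaps = positive_spacings(seq)
    return bool(gaps) and all(gap in (3, 4) for gap in gaps)


def composition_ok(seq):
    """True if every residue belongs to one of the two claimed sets."""
    return all(aa in HYDROPHOBIC or aa in KARYOPHILIC for aa in seq)


if __name__ == "__main__":
    for name, seq in [("SEQ ID NO:1", "KAWLKAF" * 3),
                      ("SEQ ID NO:2", "GLFKAAAKLLKSLWKLLLKA")]:
        print(name, positive_spacings(seq), spacing_ok(seq), composition_ok(seq))
```

SEQ ID NO:2 contains a serine, which is outside both residue sets, so it passes the spacing check but not the strict composition check; this matches the claim listing it separately from the prototype derivatives.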

18. The method of claim 1, wherein the fusogenic/NLS peptide conjugate
comprises an amino acid sequence selected from the group consisting of
GLFKAIAGFIKNGWKGMIDGGGYC (SEQ ID NO:4) from influenza virus
hemagglutinin HA-2 and YGRKKRRQRRR (SEQ ID NO:5) from TAT of HIV.

19. The method of claim 1, wherein the fusogenic/NLS peptide conjugate
comprises an amino acid sequence selected from the group consisting of
MSGTFGGILAGLIGLL(K/R/H)1-6 (SEQ ID NO:6), derived from the N-terminal
region of the S protein of duck hepatitis B virus but with the addition of one to six
positively-charged lysine, arginine or histidine residues, and combinations of these,
GAAIGLAWIPYFGPAA (SEQ ID NO:7) derived from the fusogenic peptide of the
Ebola virus transmembrane protein; residues 53-70 (C-terminal helix) of
apolipoprotein (apo) AII peptide, the 23-residue fusogenic N-terminal peptide of
HIV-1 transmembrane glycoprotein gp41, the 29-42-residue fragment from
Alzheimer's beta-amyloid peptide, the fusion peptide and N-terminal heptad repeat
of Sendai virus, the 56-68 helical segment of lecithin cholesterol acyltransferase.

20. The method of any of claims 13 to 19, wherein the NLS peptide
components in the fusogenic/NLS peptide conjugates are synthetic peptides containing
the above said NLS but further modified by additional K, R, H residues at the central
part of the peptide or with P or G at the N- or C-terminus.

21. The method of claim 13, wherein the fusogenic peptide/NLS peptide
conjugates are linked to each other with a short amino acid stretch representing an
endogenous protease cleavage site.

22. The method of claim 1, wherein the structure of the preferred
prototype fusogenic/NLS peptide conjugate used in this invention is:
PKKRRGPSP(L/A/I)12-20 (SEQ ID NO:8) where (L/A/I)12-20 is a stretch of 12-20
hydrophobic amino acids containing A, L, I, Y, W, F and other hydrophobic amino
acids.

23. The method of claim 1, wherein the fusogenic/NLS peptide
conjugates are added to the mixture of DNA/cationic lipid and are incorporated into
micelles.

24. The method of claim 1, further comprising combining an effective 
amount of an encapsulating lipid solution to step b). 


25. The method of claim 24, wherein the encapsulating lipid is a lipid
comprising cholesterol (40%), dioleoylphosphatidylethanolamine (DOPE) (20%),
palmitoyloleoylphosphatidylcholine (POPC) (12%), hydrogenated soy
phosphatidylcholine (HSPC) (10%), distearoylphosphatidylethanolamine (DSPE)
(10%), sphingomyelin (SM) (5%), and derivatized vesicle-forming lipid M-PEG-
DSPE (3%).
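
As a quick consistency check, the seven percentages in this claim add up to 100. The sketch below encodes the composition and splits a total lipid amount among the components. It assumes the percentages are mole percent, which the claim does not state, and the names are hypothetical.

```python
# Percentages as recited in the claim; treating them as mole percent is an
# assumption made for this illustration.
ENCAPSULATING_LIPID_PERCENT = {
    "cholesterol": 40,
    "DOPE": 20,
    "POPC": 12,
    "HSPC": 10,
    "DSPE": 10,
    "SM": 5,
    "M-PEG-DSPE": 3,
}


def split_total_lipid(total_umol, composition=ENCAPSULATING_LIPID_PERCENT):
    """Divide a total lipid amount (umol) among components by percentage."""
    total_percent = sum(composition.values())
    if total_percent != 100:
        raise ValueError(f"composition sums to {total_percent}%, not 100%")
    return {name: total_umol * pct / 100.0 for name, pct in composition.items()}


if __name__ == "__main__":
    for name, umol in split_total_lipid(50.0).items():
        print(f"{name:>12}: {umol:.2f} umol")
```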

26. The method of claim 24, wherein the encapsulating lipid is a
liposome. 

27. The method of claim 26, wherein the liposomes comprise vesicle-forming
lipids and between about 1 to about 7 mole percent of
distearoylphosphatidyl ethanolamine (DSPE) derivatized with an effective amount of
polyethyleneglycol.

28. The method of claim 27, wherein the liposomes have a selected
average size of about 80 to about 160 nm.

29. The method of claim 27, wherein the polyethyleneglycol has a 
molecular weight from about 1,000 to about 5,000 daltons. 

30. A micelle with an entrapped therapeutic agent produced by the
method of claim 1.

31. A liposome encapsulated therapeutic agent produced by the method of 
claim 24. 

32. The method of claim 31, wherein the therapeutic agent further
comprises regulation by a liver, spleen or bone marrow regulatory DNA sequence. 

33. The method of claim 32, wherein the regulatory DNA sequence is
nuclear matrix DNA isolated from liver, spleen or bone marrow cells.

34. A method for delivering a therapeutic agent in vivo, comprising 
administration of an effective amount of the micelle of claim 30 to a subject. 

35. The method of claim 34, wherein the therapeutic agent further
comprises regulation by a tumor-specific regulatory DNA sequence.

36. The method of claim 35, wherein the tumor-specific regulatory
sequence is nuclear matrix DNA isolated from specific tumor cells.

37. A method for delivering a therapeutic agent in vivo, comprising
administration of an effective amount of the liposome encapsulated agent of claim
31 to the subject.

38. The method of claims 34 or 37, wherein the administration is 
intravenous administration or by injection. 


39. A micelle with an entrapped DNA polynucleotide produced by the 
method of claim 9. 

40. A method for reducing tumor size in a subject comprising
administration of an effective amount of the micelle of claim 39 to the subject.

41. The method of claim 40, further comprising administration of an
effective amount of a second therapeutic agent, wherein the agent is selected from
the group consisting of ganciclovir, 5-fluorocytosine, an antisense oligonucleotide, a
ribozyme, and a triplex-forming oligonucleotide directed against genes that control
the cell cycle or signaling pathways.

42. The method of claim 41, further comprising administration of an
effective amount of a second therapeutic agent, wherein the second therapeutic agent
is selected from the group consisting of adriamycin, angiostatin, azathioprine,
bleomycin, busulfane, camptothecin, carboplatin, carmustine, chlorambucile,
chlormethamine, chloroquinoxaline sulfonamide, cisplatin, cyclophosphamide,
cycloplatam, cytarabine, dacarbazine, dactinomycin, daunorubicin, didox,
doxorubicin, endostatin, enloplatin, estramustine, etoposide, extramustinephosphat,
flucytosine, fluorodeoxyuridine, fluorouracil, gallium nitrate, hydroxyurea,
idoxuridine, interferons, interleukins, leuprolide, lobaplatin, lomustine,
mannomustine, mechlorethamine, mechlorethaminoxide, melphalan,
mercaptopurine, methotrexate, mithramycin, mitobronitole, mitomycin,
mycophenolic acid, nocodazole, oncostatin, oxaliplatin, paclitaxel, pentamustine,
platinum-triamine complex, plicamycin, prednisolone, prednisone, procarbazine,
protein kinase C inhibitors, puromycine, semustine, signal transduction inhibitors,
spiroplatin, streptozotocine, stromelysin inhibitors, taxol, tegafur, telomerase
inhibitors, teniposide, thalidomide, thiamiprine, thioguanine, thiotepa, tiamiprine,
tretamine, triaziquone, trifosfamide, tyrosine kinase inhibitors, uramustine,
vidarabine, vinblastine, vinca alcaloids, vincristine, vindesine, vorozole,
zeniplatin, and zinostatin.
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The search profile you entered was too complex or gave too many 
answers. Simplify or subdivide the query and try again. If you have 
exceeded the answer limit, enter DELETE HISTORY at an arrow prompt 
(=>) to remove all previous answers sets and begin at LI. Use the 
SAVE command to store any important profiles or answer sets before 
using DELETE HISTORY. 

=> s [fyw] [stnq] | [stnq] [fyw]/SQSP and sql<=15 
193285 [FYW] [STNQ] | [STNQ] [FYW] /SQSP 
1550662 SQL<=15 
L2 193285 [FYW] [STNQ] | [STNQ] [FYW] /SQSP AND SQL<=15 
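
The query above reads as a motif search: peptides of at most 15 residues in which an aromatic residue (F, Y or W) sits directly next to S, T, N or Q. A rough Python equivalent is sketched below. It assumes that `/SQSP` means a subsequence match against protein sequences and that `SQL` is the sequence length; it is not a reimplementation of the STN search engine, and the function name is hypothetical.

```python
import re

# Aromatic residue adjacent to S/T/N/Q, in either order.
MOTIF = re.compile(r"[FYW][STNQ]|[STNQ][FYW]")


def matches_query(sequence, max_length=15):
    """True if the peptide is short enough and contains the motif."""
    seq = sequence.upper()
    return len(seq) <= max_length and MOTIF.search(seq) is not None


if __name__ == "__main__":
    # Peptides taken from the sequence tables of this document.
    for peptide in ["GGKKRHRKRKGSY", "YMRRRTYRA", "TKKKSFKL", "AKAKAKKA"]:
        print(peptide, matches_query(peptide))
```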

=> s L2 and 2001/ED 

6766126 2001/ED 

(20010000-20019999/ED) 
L3 16945 L2 AND 2001/ED 

=> d L3 

L3 ANSWER 1 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN

RN 379722-40-4 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Tyrosine, glycylglycyl-L-lysyl-L-lysyl-L-arginyl-L-histidyl-L-arginyl-L- 

lysyl-L-arginyl-L-lysylglycyl-L-seryl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 588: PN : WO0193836 SEQID: 586 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C66 H116 N28 O16
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

=> 

=> d L3 2-20 

L3 ANSWER 2 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN
RN 379722-30-2 REGISTRY
ED Entered STN: 31 Dec 2001 

OTHER NAMES: 

CN 576: PN: WO0193836 SEQID: 574 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C46 H82 N12 O11
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 




**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**



1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 3 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379722-10-8 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Alanine, L-tyrosyl-L-methionyl-L-arginyl-L-arginyl-L-arginyl-L-threonyl- 

L-tyrosyl-L-arginyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 547: PN: WO0193836 SEQID: 545 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C54 H89 N21 O13 S
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 



1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 



L3 ANSWER 4 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379722-08-4 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Arginine, L-prolyl-L-lysyl-L-methionyl-L-arginyl-L-arginyl-L-arginyl-L- 

threonyl-L-tyrosyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 545: PN: WO0193836 SEQID: 543 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C53 H94 N22 O12 S
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 



L3 ANSWER 5 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN

RN 379721-09-2 REGISTRY

ED Entered STN: 31 Dec 2001

CN L-Phenylalanine, L-prolyl-L-lysyl-L-lysyl-L-prolyl-L-arginyl-L-histidyl-L-
glutaminyl- (9CI) (CA INDEX NAME)
OTHER NAMES:

CN 440: PN: WO0193836 SEQID: 438 claimed protein 



FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C48 H76 N16 O10
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 6 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379721-03-6 REGISTRY 

ED Entered STN: 31 Dec 2001

CN L-Tyrosine, L-leucyl-L-arginyl-L-arginyl-L-arginylglycyl-L-arginyl-L- 

glutaminyl-L-threonyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 434: PN: WO0193836 SEQID: 432 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C50 H88 N22 O13
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 



Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 7 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379721-00-3 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Asparagine, L-isoleucyl-L-α-glutamyl-L-arginyl-L-arginyl-L-arginyl-

L-arginyl-L-phenylalanyl- (9CI) (CA INDEX NAME)
OTHER NAMES:

CN 431: PN: WO0193836 SEQID: 429 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C48 H83 N21 O12
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 





**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 8 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379719-67-2 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Glutamine, L-asparaginyl-L-leucyl-L-arginyl-L-lysyl-L-lysyl-L-isoleucyl- 
L-lysyl-L-seryl-L-phenylalanyl-L-asparaginyl-L-lysyl-L-leucyl- (9CI) (CA
INDEX NAME) 

OTHER NAMES: 

CN 286: PN: WO0193836 SEQID: 284 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH
MF C73 H129 N23 O18
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 
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1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 



L3 ANSWER 9 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379719-15-0 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Asparagine, L-tyrosyl-L-leucyl-L-arginyl-L-arginyl-L-alanyl-L-methionyl- 

L-lysyl-L-arginyl-L-phenylalanyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 231: PN: WO0193836 SEQID: 229 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C60 H99 N21 O13 S
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 10 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379719-05-8 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Leucine, L-α-glutamyl-L-phenylalanyl-L-threonyl-L-lysyl-L-arginyl-

L-arginyl-L-arginyl-L-threonyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 218: PN: WO0193836 SEQID: 216 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 



MF C52 H91 N19 O14
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 11 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379719-02-5 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Methionine, L-tyrosyl-L-valyl-L-alanyl-L-isoleucyl-L-lysyl-L-threonyl-L-
lysyl-L-lysyl-L-arginyl-L-isoleucyl-L-leucyl-L-leucyl-L-tyrosyl-L-threonyl-
(9CI) (CA INDEX NAME)
OTHER NAMES: 

CN 215: PN: WO0193836 SEQID: 213 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C87 H149 N21 O20 S 
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 



Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 12 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379718-85-1 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Proline, L-lysyl-L-tyrosyl-L-alanyl-L-valyl-L-lysyl-L-lysyl-L-leucyl-L-
lysyl-L-valyl-L-lysyl-L-phenylalanyl-L-serylglycyl- (9CI) (CA INDEX NAME) 
OTHER NAMES: 

CN 196: PN: WO0193836 SEQID: 194 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C77 H129 N19 O17
SR CA

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE)
1 REFERENCES IN FILE CAPLUS (1907 TO DATE)

L3 ANSWER 13 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379718-83-9 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Aspartic acid, L-prolyl-L-alanyl-L-glutaminyl-L-lysyl-L-leucyl-L-arginyl- 
L-lysyl-L-lysyl-L-asparaginyl-L-asparaginyl-L-phenylalanyl- (9CI) (CA
INDEX NAME) 

OTHER NAMES: 

CN 194: PN: WO0193836 SEQID: 192 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C64 H107 N21 O18
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL 
Absolute stereochemistry. 
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**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT** 

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 14 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379718-51-1 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN Glycine, L-α-glutamyl-L-leucyl-L-arginyl-L-glutaminyl-L-phenylalanyl-
L-histidyl-L-arginyl-L-arginyl-L-seryl-L-leucyl- (9CI) (CA INDEX NAME)
OTHER NAMES: 

CN 159: PN: WO0193836 SEQID: 157 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C60 H99 N23 O16
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL



Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 15 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379718-38-4 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN Glycine, glycyl-L-phenylalanyl-L-alanyl-L-lysyl-L-arginyl-L-valyl-L-
lysylglycyl-L-arginyl-L-threonyl-L-tryptophyl-L-threonyl-L-leucyl-L-
cysteinyl- (9CI) (CA INDEX NAME)

OTHER NAMES: 

CN 142: PN: WO0193836 SEQID: 140 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C75 H122 N24 O18 S
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL



Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**



1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 



L3 ANSWER 16 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379718-30-6 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Threonine, L-seryl-L-tyrosyl-L-valyl-L-valyl-L-histidyl-L-lysyl-L-
arginyl-L-cysteinyl-L-histidyl-L-α-glutamyl-L-tyrosyl-L-valyl- (9CI)
(CA INDEX NAME)

OTHER NAMES: 

CN 132: PN: WO0193836 SEQID: 130 claimed protein 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C72 H109 N21 O20 S 
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 



**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**



1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 17 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379717-85-8 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Lysine, L-arginyl-L-lysyl-L-phenylalanyl-L-lysyl-L-lysyl-L-phenylalanyl-
L-asparaginyl- (9CI) (CA INDEX NAME)
OTHER NAMES: 

CN 58: PN: WO0193836 SEQID: 56 claimed protein

CN 66: PN: WO2006042214 SEQID: 30 unclaimed sequence 

FS PROTEIN SEQUENCE; STEREOSEARCH 

MF C52 H86 N16 O10 

SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPATFULL
Absolute stereochemistry. 



**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

2 REFERENCES IN FILE CA (1907 TO DATE) 

2 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 18 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379711-25-8 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Leucine, L-seryl-L-phenylalanyl-L-asparaginyl-L-seryl-L-tyrosyl-L-
α-glutamyl-L-leucylglycyl-L-seryl- (CA INDEX NAME)
OTHER NAMES: 

CN 15: PN: US20060148700 SEQID: 15 claimed protein 
CN 1: PN: US20060148702 SEQID: 1 claimed sequence 



CN 1: PN: WO2005099721 SEQID: 1 claimed protein 

CN 1: PN: WO2006080941 SEQID: 1 claimed sequence 

CN 1: PN: WO2007027974 SEQID: 1 claimed protein 

CN 33: PN: US20060153867 SEQID: 34 claimed sequence 

CN 3: PN: WO2005107789 SEQID: 3 claimed sequence 

CN 4: PN: WO2007143119 SEQID: 4 unclaimed sequence 

CN 8-17-δ protein kinase C (Rattus norvegicus isoform δV1-1)
(Rattus norvegicus)

CN 8-17-Kinase (phosphorylating), protein, nPKC (Rattus norvegicus)

FS PROTEIN SEQUENCE; STEREOSEARCH 

MF C50 H73 N11 O18

SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER, USPAT2, USPATFULL
Absolute stereochemistry. 



**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

15 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES TO NON-SPECIFIC DERIVATIVES IN FILE CA 
15 REFERENCES IN FILE CAPLUS (1907 TO DATE) 

L3 ANSWER 19 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379705-46-1 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Tyrosine, L-methionyl-L-α-glutamyl-L-cysteinylglycyl-L-glutaminyl-
L-methionyl-L-seryl-L-phenylalanyl-L-lysyl-L-asparaginyl-L-isoleucyl-L-
tyrosyl-L-histidyl-L-lysyl- (9CI) (CA INDEX NAME)

OTHER NAMES: 



CN 9: PN: WO0192328 SEQID: 7 unclaimed sequence
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C83 H123 N21 O23 S3
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER 
Absolute stereochemistry. 
**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE)

L3 ANSWER 20 OF 16945 REGISTRY COPYRIGHT 2008 ACS on STN 

RN 379700-35-3 REGISTRY 

ED Entered STN: 31 Dec 2001 

CN L-Methionine, L-methionyl-L-α-aspartyl-L-threonyl-L-phenylalanyl-L-
prolyl-L-histidyl-L-valyl-L-leucyl-L-cysteinylglycyl-L-histidyl-L-
cysteinyl-L-phenylalanyl-L-tryptophyl- (9CI) (CA INDEX NAME)

OTHER NAMES: 

CN 1: PN: WO0192517 SEQID: 7 unclaimed sequence 
FS PROTEIN SEQUENCE; STEREOSEARCH 
MF C83 H114 N20 O19 S4
SR CA 

LC STN Files: CA, CAPLUS, TOXCENTER
Absolute stereochemistry. 



**PROPERTY DATA AVAILABLE IN THE 'PROP' FORMAT**

1 REFERENCES IN FILE CA (1907 TO DATE) 

1 REFERENCES IN FILE CAPLUS (1907 TO DATE) 



=> logh 

LOGH IS NOT A RECOGNIZED COMMAND 

The previous command name entered was not recognized by the system. 
For a list of commands available to you in the current file, enter 
"HELP COMMANDS" at an arrow prompt (=>).

=> log h 

COST IN U.S. DOLLARS                     SINCE FILE      TOTAL
                                              ENTRY    SESSION
FULL ESTIMATED COST                           86.22      86.43

SESSION WILL BE HELD FOR 120 MINUTES 
STN INTERNATIONAL SESSION SUSPENDED AT 22:02:51 ON 20 FEB 2008 



